
Thread: Embedded Experts: Fix Code Bugs Or Cost Lives




2006-04-11 05:20:25
http://www.informationweek.com/news/showArticle.jhtml?articleID=185300011

By Rick Merritt
EE Times 
April 10, 2006 

San Jose, Calif. - The Therac 25 was supposed to save lives
by zapping
tumors with targeted blasts of radiation. Instead, the
device
delivered massive overdoses that killed three patients and
injured
several others because of software glitches by a lone
programmer whose
code was never properly inspected and tested.

The Therac 25 was just one of dozens of examples cited by
speakers at
last week's Embedded Systems Conference here to drive home
a point:  
People's lives as well as millions of dollars in
investments often
depend on software engineering, but too many projects fail
for lack of
good programming discipline and management support.

And the problems may get worse as programmers face the
additional
challenges of handling multicore devices. Indeed, an annual
survey of
several thousand embedded engineers polled recently by EE
Times and
Embedded Systems Design magazine showed that the need for
better
software debug tools is a major concern, with test and debug taking up
more time than any other step in project development.

"This is the only industry left where we can ship
products with known
defects and not get sued. How long do you think that will
last?" asked
Jack Ganssle, a consultant and author who presented a class
on lessons
learned from embedded-software disasters.

"We aren't afraid of software, but we need to be,
because one wrong
bit out of 100 million can cause people to die," said
Ganssle, who
said he has worked on more than 100 embedded projects,
including the
White House security system.

"As embedded systems grow in complexity, the software
becomes an ever
more important piece. Right now, 50 percent of our DSP
spending is on
the software side," said Gerald McGuire, general
manager of the DSP
group at Analog Devices Inc. (Norwood, Mass.), which employs
more than
200 software engineers.

As software grows in importance, it is not necessarily
becoming more
reliable. According to one report, 80 percent of software
projects
fail because they are over budget, late, missing key
features or a
combination of factors. Another report suggests that large
software
systems of more than a million lines of code may have as
many as
20,000 errors, 1,800 of them still unresolved after a year.

"We can't get rid of faults," said Lorenzo
Fasanelli, a senior
embedded-software specialist for Ericsson Labs in Italy. But
engineers
can speak up about faults, learn from them and rewrite code
to
proactively find and minimize them, he added.

"We cannot advance the state of the art without
studying failure,"  
said Kim Fowler, an author and systems architect who
delivered an ESC
talk called "Fantastic Failures."


War stories

There are plenty of failures from which to learn. Ganssle
cited
another radiation system that killed 28 people in a series
of tests in
Panama in May 2001 before the U.S. Food and Drug
Administration shut
down the company that made it. Inspections of software after
the crash
of a U.S. Army Chinook helicopter revealed 500 errors,
including 50
critical ones, in just the first 17 percent of code tested.

"Why did they inspect software only after people
died?" asked Ganssle,
who said a court case on the crash is still in litigation.

Some pacemakers have stimulated hearts to beat at rates of
190 beats a
minute, prompting companies to provide software updates
delivered to
the implanted devices using capacitive coupling.
Unfortunately, other
pacemaker patients have had their devices inadvertently
reprogrammed
when walking through metal detectors. In 2003, the pacemaker
of a
woman in Japan was accidentally reprogrammed by her rice
cooker.

A Thai politician had to have police bust the windows on his
BMW 745i
after a software glitch caused the electric doors and
windows to
freeze in a locked state, trapping him inside. Ford recalled
some
models of its 2000 Explorer because lights and wipers would
not work
in some circumstances. And the 2004 Pontiac Grand Prix faced
a
software recall for a leap-year fault.

Part of the problem lies in poor engineering discipline,
such as a
lack of adequate testing, improper error handling and
inherently
sloppy languages. Management issues, including a demand for
ever more
features in compressed schedules, and tight budgets are also
to blame.

"We need to test everything up front and integrate
testing into the
design process. Then we need to believe the data we get when
we do
test," said Ganssle.

When engineers make a change because of a failed test, they
often
neglect to go back to the beginning of the test suite to
make sure the
changes haven't introduced new errors, said Dave Stewart,
chief
technology officer of Embedded Research Solutions Inc.
(Annapolis,
Md.) in an ESC session on the top problems in real-time
software
design.
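
Stewart's point about going back to the beginning of the test suite can be sketched in C. This is a minimal illustration, not anything shown at ESC: the function under test (clamp_percent), the three tests and the runner are all hypothetical names. The runner always starts at the first test and never stops early, so a fix made for one failing test is checked against every other test as well.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical function under test: clamps a raw reading to 0..100. */
static int clamp_percent(int raw)
{
    if (raw < 0)   return 0;
    if (raw > 100) return 100;
    return raw;
}

typedef int (*test_fn)(void);   /* each test returns 1 on pass, 0 on fail */

static int test_in_range(void)   { return clamp_percent(42) == 42; }
static int test_below_zero(void) { return clamp_percent(-5) == 0; }
static int test_above_max(void)  { return clamp_percent(250) == 100; }

static const test_fn suite[] = {
    test_in_range, test_below_zero, test_above_max
};

/* Runs every test starting from the first entry, without stopping at the
   first failure, and returns the number of failed tests. */
int run_full_suite(void)
{
    int failures = 0;
    for (size_t i = 0; i < sizeof suite / sizeof suite[0]; i++) {
        if (!suite[i]()) {
            printf("test %zu failed\n", i);
            failures++;
        }
    }
    return failures;
}
```

Calling run_full_suite() after every code change, rather than rerunning only the test that failed, is the habit Stewart describes.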

Engineers need to create error-handling modes in their
programs, and
the modes must exist as just another state for their systems
and treat
errors as one of many possible inputs, Stewart added.
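
One way to read Stewart's advice is as a state machine in which the error mode sits alongside the normal states and a fault is just one more event. The sketch below is an assumption about how that might look in C; the state and event names are invented for illustration and do not come from his session.

```c
/* Hypothetical states and events; ST_ERROR is an ordinary state and
   EV_FAULT is an ordinary input. */
typedef enum { ST_IDLE, ST_RUNNING, ST_ERROR } state_t;
typedef enum { EV_START, EV_STOP, EV_FAULT, EV_RESET } event_t;

/* Pure transition function: given the current state and an event,
   return the next state. */
state_t next_state(state_t s, event_t e)
{
    if (e == EV_FAULT) {
        return ST_ERROR;            /* a fault is accepted in any state */
    }
    switch (s) {
    case ST_IDLE:    return (e == EV_START) ? ST_RUNNING : ST_IDLE;
    case ST_RUNNING: return (e == EV_STOP)  ? ST_IDLE    : ST_RUNNING;
    case ST_ERROR:   return (e == EV_RESET) ? ST_IDLE    : ST_ERROR;
    }
    return ST_ERROR;                /* unknown state value: fail safe */
}
```

Because the error mode is reached and left through the same transition function as every other state, it can be exercised by the same tests as the normal paths.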

Fasanelli of Ericsson gave a detailed prescription for how
to find,
report and minimize faults in embedded software. Programmers
must make
it a standard practice to classify all inputs and states of
a system
and note any illegal inputs or edge states, whether or not
they affect
a program's ability to run, he said.
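
The classification Fasanelli describes might look something like the following C sketch. The setpoint limits, type names and counters are hypothetical, chosen only to show the idea: every input is sorted into nominal, edge or illegal, and the edge and illegal cases are counted even when the program could carry on without noticing them.

```c
/* Hypothetical limits for an integer setpoint input. */
#define SETPOINT_MIN 0
#define SETPOINT_MAX 200

typedef enum { IN_NOMINAL, IN_EDGE, IN_ILLEGAL } input_class_t;

/* Running totals that a fault report could include. */
static unsigned long edge_count;
static unsigned long illegal_count;

/* Classifies one input and records edge and illegal cases. */
input_class_t classify_setpoint(int value)
{
    if (value < SETPOINT_MIN || value > SETPOINT_MAX) {
        illegal_count++;
        return IN_ILLEGAL;
    }
    if (value == SETPOINT_MIN || value == SETPOINT_MAX) {
        edge_count++;
        return IN_EDGE;
    }
    return IN_NOMINAL;
}
```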

In addition, programs should routinely track and report
their own
performance, idle times and memory integrity. Creating such
debug
features may affect a system's cost, but that will be
offset by
reduced maintenance, Fasanelli said.
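
As a rough illustration of such self-monitoring, the C sketch below keeps idle and busy tick counts and checks a memory region against a reference checksum. Everything here is an assumption made for the example: the health_t structure, the function names and the simple rotate-and-XOR checksum, which stands in for whatever CRC a real product would use.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical health record a program could report periodically. */
typedef struct {
    uint32_t idle_ticks;
    uint32_t busy_ticks;
    uint32_t memory_faults;
} health_t;

/* Simple rotate-and-XOR checksum over a byte region (illustrative only). */
uint32_t region_checksum(const uint8_t *p, size_t n)
{
    uint32_t sum = 0;
    while (n--) {
        sum = ((sum << 1) | (sum >> 31)) ^ *p++;
    }
    return sum;
}

/* Records whether the last scheduler tick was spent idle or busy. */
void health_tick(health_t *h, int was_idle)
{
    if (was_idle) h->idle_ticks++;
    else          h->busy_ticks++;
}

/* Returns 1 if the region still matches its reference checksum;
   otherwise counts a memory fault and returns 0. */
int health_check_memory(health_t *h, const uint8_t *p, size_t n,
                        uint32_t expected)
{
    if (region_checksum(p, n) != expected) {
        h->memory_faults++;
        return 0;
    }
    return 1;
}

/* Idle time as a whole percentage of all ticks seen so far. */
unsigned health_idle_percent(const health_t *h)
{
    uint64_t total = (uint64_t)h->idle_ticks + h->busy_ticks;
    return total ? (unsigned)((uint64_t)h->idle_ticks * 100 / total) : 0;
}
```

The extra code and memory are the cost Fasanelli mentions; the counters are what a maintenance engineer would later read back.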

"Exception handling is particularly hard to test
because it's hard to
generate the exceptions. These tend to be the most poorly
tested parts
of code," said Ganssle.
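
One common way around the difficulty Ganssle describes is fault injection: the code receives the fallible operation as a parameter, so a test can substitute a version that always fails. The C sketch below assumes a memory-allocation failure as the error case; copy_message and failing_alloc are hypothetical names, and since C has no exceptions the "exception" here is an error return.

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Allocator signature, so tests can pass in a failing substitute. */
typedef void *(*alloc_fn)(size_t);

/* Copies msg into newly allocated storage.
   Returns 0 on success, or -1 (with *out set to NULL) if allocation fails. */
int copy_message(alloc_fn alloc, const char *msg, char **out)
{
    size_t n = strlen(msg) + 1;
    char *buf = alloc(n);
    if (buf == NULL) {
        *out = NULL;
        return -1;              /* the rarely exercised error path */
    }
    memcpy(buf, msg, n);
    *out = buf;
    return 0;
}

/* Test double: an allocator that always reports out-of-memory. */
void *failing_alloc(size_t n)
{
    (void)n;
    return NULL;
}
```

Production code would call copy_message(malloc, ...), while a test calls copy_message(failing_alloc, ...) to force the error branch on demand.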


Riding a rough C

Ironically, today's most popular programming languages, C and C++, are
among the most error prone. That's because C compilers have plenty of
latitude to compile and link, without providing any diagnostics, code
that can produce serious run-time errors, especially when ported to a
new processor.

"There are a lot of little goodies in C that
programmers are not fully
aware of," said Dan Saks, an author who has documented
nearly 40
"gotchas" he presented in a session at ESC.
"The lesson is to
understand what you can assume and what you can't."

For instance, C doesn't define the number of bits in a byte, though a
standard header reports the value for each target so a program can
adjust if the CPU does not support the usual 8-bit byte. Likewise, the
common practice of subtracting pointers produces a result whose type
is left to the implementation, said Saks, president of consulting firm
Saks & Associates (Springfield, Ohio).
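
Both points can be illustrated with standard C. The sketch below is not taken from Saks's session: the function names are hypothetical, while CHAR_BIT (from <limits.h>) and ptrdiff_t (from <stddef.h>) are the standard names for the byte width and for the type of a pointer difference.

```c
#include <limits.h>
#include <stddef.h>

/* Refuse to build on a target where the 8-bit byte assumption is false. */
#if CHAR_BIT != 8
#error "this code assumes 8-bit bytes"
#endif

/* Storage size of an unsigned int in bits, computed rather than assumed. */
unsigned bits_in_uint(void)
{
    return (unsigned)(sizeof(unsigned) * CHAR_BIT);
}

/* Pointer subtraction yields ptrdiff_t, a signed integer type whose width
   is chosen by the implementation; both pointers must point into (or one
   past the end of) the same array. */
ptrdiff_t elements_between(const int *first, const int *last)
{
    return last - first;
}
```

Storing the result of elements_between in ptrdiff_t rather than int avoids assuming a width that may not hold on the next processor.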

"The use of C is really criminal," said Ganssle.
"C will compile a
telephone directory, practically. I guess we use C because
we think
debugging is fun."

For every 1,000 lines of code, C can generate 500 errors in
a worst
case, 167 errors on average or 12.5 mistakes for
automatically
generated code, said Ganssle. That compares with 50 errors
worst case,
25 average and 4.8 for auto-generated code using the Ada
language, he
said. The Spark language emerging from Europe is even
better,
generating just four errors on average per 1,000 lines of
code, he
claimed.

C is used in half the development projects done today,
according to
the results from the 2006 Embedded Market Survey, the 14th
such annual
poll of engineers working on embedded-design projects. The
survey
showed that the C++ programming language is gaining in
acceptance,
however.

ESD editor-in-chief Jim Turley, who presented the annual
embedded-market survey results last week, said fully half of
the
respondents cited C as their primary programming language.  
Nonetheless, support for C was down from the 2005 survey,
albeit only
by 3 percent. By contrast, the C++ language gained this
year, coming
in at 28 percent, and respondents predicted a 4 percent
increase in
C++ adoption next year.

The survey showed that relatively few engineers - just a few
percent -
use Java. Matlab, LabView and UML are used about as
frequently for
embedded projects, although Java garners more attention
because of its
use in the graphical user interface portion of many systems.

"Almost every language is losing ground to C++,"
said Turley, who
suggested that many design teams have evaluated Java but
found it
lacking in performance and development tools.

Asked about tool selection, 53 percent of embedded engineers
said the
quality of the debugger was their most important criterion
in choosing
a development suite. Only about 13 percent said open-source
content is
an important selection criterion.

When it comes to operating systems, however, open-source
OSes such as
Linux are gaining significant support. Fully 20 percent of
respondents
said they use an open-source OS, with many design teams
relying on a
commercially distributed form of Linux.

Turley said that one reading of the operating system
responses would
suggest Linux is gaining support quickly, since "just
five years ago
the very term 'open source' didn't mean anything."
However, other
survey questions showed that a declining number of
respondents,
compared with the 2005 survey, are considering Linux, prompting Turley
to conclude that "the charm of Linux has cooled."

Management must take its share of the blame for the software
situation. "Often we are in an overconstrained
situation. We have too
many features to deliver in too short a time frame,"
said Fowler at
his "Fantastic Failures" session. "The
problem is, adding features
requires lots of regression testing. The thing to do is ask
whether
the feature can be saved for the next upgrade - [otherwise]
you are
just setting yourself up for failure.

"We as engineers need to come up with persuasive ways
to warn
management" by relating stories of past failures or
the implications
of long feature lists and tight budgets and schedules, he
added.

Tired engineers were a factor in several aerospace disasters
in which
programmers worked 60- to 80-hour weeks in the months before
a launch,
Ganssle said.

Skimpy budgeting is another factor in failures, as seen most
clearly
in civil-engineering disasters. In 1940, officials found a
way to
build the Tacoma Narrows Bridge for half an initial estimate
and did
so, but the bridge famously collapsed in high winds after
just four
months in service. Likewise, the MGM Grand Hotel in Las
Vegas saved
$200,000 by not using sprinklers but paid out more than $200
million
in court and rebuilding costs after a disastrous fire,
Ganssle said.

In the confines of a software project, "spending
$2,000 on tools might
save you $100,000 in programming effort," said Stewart
of Embedded
Research Solutions.

[...]

- Additional reporting by Richard Goering and David Lammers

Copyright © 2005 CMP Media LLC



