Computer With Wings -- Boeing's Ultracomplex 777 Flies Into Debate Over Technology Hazards
(Copyright, 1995, The Seattle Times Co.)
Nothing defines The Boeing Co. so much as its uncanny ability to shepherd risks.
Throughout its history Boeing has gambled big - and triumphed - by introducing new jetliners that pushed existing technology to its limits, then delivering in the face of high-pressure deadlines.
Boeing has done it again with the new 777. After more than five years and $5 billion in development, the 777 is to carry its first paying passengers this week, precisely on schedule.
For all the thousands of aircraft it has built since 1916, Boeing has never made anything like the 777.
The 777 is Boeing's first full-fledged "fly-by-wire" aircraft, in industry jargon. The "wire" is electrical circuitry. A 777 pilot will not so much fly the airplane as tend a sprawling, vastly complex network of computers flying the airplane.
The 777's computerized controls are so advanced that they fall into an area of rising debate among technology professionals and academics: How do you assure safety when ultracomplex computer systems are used in potentially hazardous applications?
The control system proved so difficult to assemble and test that Boeing was forced to abandon a conservative development strategy as it hurried to meet its delivery deadline. After numerous delays and scores of changes, the final version of the 777's key flight-control software was delivered in April this year - 11 months after Boeing had originally promised it, only four weeks before the first airplane was delivered to United Airlines.
The combination of an ultracomplex technology and development that constantly ran behind schedule has left some software experts, pilots, aviation-safety consultants and even 777-program insiders wondering if Boeing left itself enough time to adequately test the arcane software code that drives the 777's computers.
"It's a question of extensiveness," says Prof. Bev Littlewood, director for the Centre for Software Reliability at City University in London, and the author of several research papers on software safety. "Even a small change in software can have catastrophic effects."
Boeing officials declined to be interviewed for this story.
However, Anthony Broderick, Federal Aviation Administration associate administrator, said the 777 came through its development period as no other jetliner ever has.
"The testing has shown the 777 to be the finest, most reliable, safest airplane ever delivered," Broderick said.
Jim Treacy, the FAA's national expert on integrating airplane computer systems, and Mike DeWalt, the agency's top software expert, acknowledge that the agency allowed major changes to the 777's development plan so the airplane would stay on schedule. But they say the changes were reasonable and that they largely followed procedural rules hashed out years ago.
Treacy and DeWalt believe that any significant problems with the 777 were revealed and fixed in what they characterize as a routine development, testing and certification process.
Even so, Treacy termed "unfortunate" the degree to which jet-transport builders are allowed to plant "a stake in the ground" representing the projected FAA certification date for a new model, usually a few weeks before delivery.
"It's become: Everything else can move but the certification date," he said. "There's very much a huge pressure on the part of the applicant that says you gotta meet that date."
Simple control
Remove the bells and whistles and a modern jetliner is a straightforward apparatus: Two or more engines push the device through the air, causing a physical phenomenon called lift to affect the wings.
Movable surfaces on the wings (flaps, slats and spoilers) increase or decrease lift to get the jet off the ground, or to slow it for landing.
Ailerons on the outer portion of each wing move in opposing directions to cause the aircraft to roll.
Elevators, on the horizontal tail section, flip up or down, tilting the nose up or down.
The rudder, in the vertical part of the tail, swings left or right, turning the nose.
In traditional aircraft, the various control surfaces are linked to the cockpit with a system of mechanical pulleys and cables. A pilot controls them by turning a wheel, pushing or pulling the yoke the wheel is attached to, or by pressing foot pedals.
Boeing sold thousands of such conventionally engineered commercial jetliners worldwide, and came to dominate the market, chased by U.S. rivals McDonnell Douglas and Lockheed Corp. By 1970, the big-three jet makers had mastered mechanical engineering as it applied to jetliners.
Jetliner technology began changing in the mid-1970s with the emergence of an upstart European consortium called Airbus Industrie.
Before launching Airbus, French, British and German aerospace companies had collaborated on the angular, hook-nosed Concorde Supersonic Transport. The SST featured the first use of computer signals to control a critical system - engine air-intake valves - on a commercial aircraft. The SST showed that computers could be as reliable as cables in day-to-day airline operations.
Electronics was advancing at the same time. By the late 1970s, tiny integrated circuits, the processing chips at the heart of every computer, became widely available. Airplane designers began exploring the possibility of replacing traditional steel control cables with wires carrying computer-generated commands.
In 1978, Boeing decided to use digital technology to operate some of the control surfaces on two new aircraft, the 767 and 757. Electric signals triggered hydraulic devices in the wings to adjust the panels. It was a bold move, made cautiously. The computers controlled only secondary, nonessential controls - slats, flaps and spoilers. Boeing retained mechanical linkages for the primary controls.
The 757 went a bit further than the 767: It was the first commercial jet to use digital controls to run and monitor the engines, too.
About the same time, Airbus was developing its second model, the A310 twinjet, which used essentially the same degree of computer control as the two Boeing jets.
The early digital systems proved to be just as reliable as mechanical systems. And they had advantages: Electrical circuits saved weight compared with mechanical systems, and for airlines that meant profits - less fuel, more payload or both. Electrical circuits also were easier to maintain and update.
Airbus delivered the first A310 to Lufthansa Airlines in April 1983. Then five months later, it leapfrogged ahead in jetliner technology by announcing it would build aviation's first all-digital commercial aircraft, the A320 twinjet.
It was a calculated decision. Airbus was a distant third in market share behind Boeing and McDonnell Douglas, and needed to do something dramatic.
The fly-by-wire A320 would be so much lighter, faster, more efficient and more reliable that it couldn't help but sell like hot cakes, Airbus reasoned. It would blaze a digital trail the way Boeing's 707 blazed the trail for jet-powered commercial flight in 1960.
An ambiguous realm
A cable linkage between a pilot and a flight-control surface is a simple, direct thing, operating according to visible, mechanical forces. You can see how it works. If it doesn't work, it is not difficult to track down the problem and fix it. The complexity of mechanical systems is limited by nature; you can bend metal only so many ways in a confined space.
Computers and software are something else. A processing chip embedded with a maze of microscopic switches performs discrete mathematical operations according to detailed, coded instructions. An input of data is received, one calculation feeds data to another calculation, which feeds its results to another, which branches to several more.
Tiny bursts of electricity cascade at light speed through a plexus of computer codes and switches. In the end, something happens. A specific electronic signal is generated. A pattern of dots appears on a video screen. A hinged panel moves on an airplane wing.
The complexity of software is virtually unlimited. If something goes wrong, figuring out why, not to mention fixing the problem, is not simple at all.
In her new book, "Safeware: System Safety and Computers," Prof. Nancy Leveson, University of Washington professor of computer science and engineering, argues that computer systems are fast becoming "intellectually unmanageable."
As complexity rises, "the number of states starts to increase so quickly, you start to begin to test a smaller and smaller fraction of the total possible," said Leveson. "As systems get more complex, it gets harder and harder to predict the interactions and the different kinds of things that can happen."
'Safety critical'
On the 777 more than 4 million lines of coded instructions drive 150 computers that must work in harmony to monitor and adjust for the airplane's constantly changing position in the air. A minor miscalculation in one program could corrupt other programs. The results could range from a squawk, a harmless error message that comes and goes on the cockpit displays, to loss of control. A panel might fail to move, or move at the wrong time, or in the wrong direction, or at the wrong rate, or to the wrong position.
Aircraft engineers have had two answers to the myriad ways in which things might go wrong. First, they isolated "safety critical" computer programs so that an error in one had limited effects. Second, they provided numerous redundant backup systems.
That's how Airbus designed the A320. If the A320's computer system were a building, it would be a sprawling warehouse with pairs of posts every few feet holding up the roof. It would have many, many more pairs of posts than necessary to keep the structure sound. If a single post got knocked out, its twin could bear the load. If a pair of posts, or even several pairs were lost, the structure would remain sound.
Each computer on board the A320 contained a processing chip with memory, its own low-voltage power supply and input/output circuitry. Each was designed to shut down harmlessly in cases of a serious software or hardware failure.
Because each computer "black box" was physically separated from every other black box, and because the impact of any particular box failing was limited, the A320's architecture was deemed to be "fail-safe."
The problem with this approach is that it makes for vast duplication, with many machines and thick bundles of wire running up and down the airplane. That translates into weight, the enemy of airline profits.
The market heats up
Though its new A320 was a hit with airlines, the Airbus partners wanted a bigger family of airplanes to compete with Boeing, which offered the 150-seat 737, the 194-seat 757, the 218-seat 767 and the 420-seat 747.
Airbus had no answer for the hot-selling 747 jumbo jet. It decided to stake out new territory.
Guessing that overcrowding at airports in major cities would eventually lead to more point-to-point flying between smaller cities, Airbus designed a 295-seat jet available in a two-engine version (the A330) for regional flying, or equipped with four engines (the A340) for long routes.
The new planes would feature the same fly-by-wire system as the A320, with cockpits so similar that a pilot could qualify to fly all three with minimum cross-training.
Meanwhile, Boeing was intently exploring fly-by-wire systems itself. A Boeing engineer named John Shaw in 1987 was championing a bold advance, a new computer architecture, with Aeronautical Radio Inc. (ARINC), the industry association that establishes standards for avionics.
Shaw called for abandoning the physical separation of computers, and consolidating core processing, memory, input/output, the power supply and error-tracking functions in a central cabinet.
A fly-by-wire jet would still need hundreds of separate software programs in this architecture, but the processing and memory functions would take place in a central "brain."
If current jetliner architecture resembled an overly buttressed warehouse, Shaw's architecture was a tent with a very sturdy central pole.
Shaw theorized such a system would bring about enormous weight savings and phenomenal gains in efficiency and reliability.
Shaw's design required ultrafast, ultrapowerful integrated circuits - chips that weren't expected to be available until the early 1990s. Also needed was a new "databus," the electronic pathway on which computers trade data.
The aircraft-databus standard at that time was called ARINC 429. Every computer on an airplane was hard-wired to every other computer - like having separate, one-way phone lines wired directly between your home and every other home you wanted to communicate with.
Shaw proposed moving to a new databus standard that Boeing had been developing called ARINC 629.
Compared to ARINC 429's one-lane country road, ARINC 629 was a multilane superhighway. Instead of computer-to-computer wiring, ARINC 629 would have a twisted pair of wires running the length of the aircraft. Every computer, the big central processing computer and certain specialized processors scattered around the airplane, would be connected to the databus. Computers could transmit data at any time. Data could zip around the system between computers and to the central processing unit as needed.
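The wiring difference can be sketched with simple counting. This example takes the simplified picture above at face value: under the older scheme every computer has a separate one-way line to every other computer, while under the shared bus each computer needs only one connection to the common wire pair. Real aircraft wire only the pairs that need to talk, so the first figure is an upper bound. The function names are hypothetical.

```python
def point_to_point_links(num_computers: int) -> int:
    """One-way lines needed if every computer is wired to every other one."""
    # One line per ordered (sender, receiver) pair.
    return num_computers * (num_computers - 1)


def shared_bus_taps(num_computers: int) -> int:
    """Connections needed if every computer taps a single shared bus."""
    return num_computers


if __name__ == "__main__":
    for n in (4, 20, 150):
        print(n, point_to_point_links(n), shared_bus_taps(n))
```

The first count grows with the square of the number of computers and the second grows in step with it, which is where the promised weight savings come from.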
Shaw persuaded ARINC to make ARINC 629 and a "shared-resources architecture" the next industry standard. Final specifications were to be ready in late 1991.
Boeing wasn't waiting.
Launching the 777
Airbus's A330/A340 collected 180 orders in the three years after it was launched.
In October 1990, Boeing announced that it would develop the 350-seat 777 twinjet, with a firm order for 34 jets from United Airlines.
It would be Boeing's first all fly-by-wire model.
It would use the ARINC 629 databus, still undeveloped, and it would use a shared-resources central computer called AIMS, for Airplane Information Management System. AIMS would be the brains for most of the 777 computer systems.
Everything about the 777, from cabin entertainment systems to cockpit navigational systems to its huge new fuel-sipping engines, would surpass Airbus's newest models.
Boeing promised customers that the 777 would be so reliable that the company would persuade the FAA to certify the twinjet for long ocean routes from the day it entered commercial service.
FAA rules normally require twinjets to fly shorter routes for years to prove they are reliable enough to fly long distances away from a possible emergency landing site. (Losing power from one engine on a twinjet is considered an emergency.)
Boeing argued that years of successful 767 flights across the Atlantic and Pacific demonstrated that the FAA twinjet rules were obsolete. The company promised the 777 would be so carefully designed and manufactured and so thoroughly tested that there should be no question about its immediate high reliability.
Not since the launch of the 747 jumbo-jet program in the late 1960s had Boeing gone this far out on a limb.
The 747 had involved the parallel development of a novel fuselage, a new wing and new engines. However, the mechanical flight-control systems were derived from previous Boeing models.
In industry, the parallel development of two or more novel systems is considered risky because you're dealing with tiers of unknowns. There is no network of savvy parts suppliers and the accompanying experience base. There is no operating history for any of the systems, nor any knowledge about how one system will respond to another in real use.
The 747 had proved a case in point. Engine development lagged the rest of the airplane's development, and Boeing ended up paying millions of dollars in customer penalties when engines on the first jumbo jets failed to perform as promised.
Like the 747, the 777 involved the mating of an all-new airplane with new engines. But the 777 would go further with fly-by-wire, the new ARINC 629 databus and shared-resources central computer.
Boeing committed to an ambitious development schedule: It would fly the 777 in mid-1994. After a year of test flights, the plane would enter service in June 1995. (For comparison, the SST flew as a test plane in 1969, but didn't carry its first paying passenger until 1976.)
Boeing predicted it would deliver a virtually glitch-free digital airplane.
The nature of bugs
From one perspective, all software is perfect.
Software is an intricate web of logic through which only the right data may pass.
In that sense, software always does its job.
But code writers, being human, can make clerical errors which spawn bugs - unexpected results or failures. In the 1960s, a Mariner spacecraft heading for Venus was lost because its computer code contained a semicolon instead of a comma. Today, most coding errors are caught and corrected by sophisticated testing tools.
Most often, bugs persist because programmers overlook unusual conditions that the software code might run across. The more complex the code, the greater the opportunity for programming oversights.
As Boeing stepped up development of the 777, the digital world was advancing. Progressively smaller, more powerful processing chips made vast calculations possible in a blink of an eye. This had touched off a software explosion.
Computers could handle more complicated code, so more complicated code was being written, sometimes before code writers fully grasped what they were asking the software to do. Rare conditions - situations programmers failed to anticipate - became the leading cause of bugs turning up in the late stages of development programs, said UW Prof. Leveson.
"Nature imposed a discipline on hardware systems," she said. "But it just wasn't there for software. We were dealing with man's ability to discipline himself, and we've never been any good at that."
Engineers at Saab Military Aircraft learned this lesson the hard way. On Aug. 8, 1993, test pilot Lars Radestrom was demonstrating the second production model of the new fly-by-wire Gripen fighter at the Stockholm Water Festival. Entering a turn, Radestrom noted the computer banked the jet 10 degrees more than he asked for. When Radestrom tried to compensate, the jet began pitching up and down. The more the pilot tried to smooth out flight, the wilder the pitching became.
After 6.2 seconds, Radestrom bailed out and the jet crashed harmlessly on a nearby island. Despite a $3.2 billion development effort, including four years of flight testing and debugging, Saab programmers had failed to anticipate that a pilot might try large, rapid movements of the control stick. When that happened, the software did what it was programmed to do - but not what Radestrom expected: It amplified the command.
The job of writing software for the 777 is nothing if not complex. An FAA test pilot describes the 777 as the most complicated machine ever built: "No question, it's more complex than the Space Shuttle by a factor of two."
Keeping a 500,000-pound airplane safely aloft involves dozens of simultaneous, overlapping tasks executed by a swarm of electrical bursts flowing through a maze of microcircuits in the proper sequence, all at blinding speed.
A bug could occur if just one of the thousands of calculations occurring each millisecond unfolded out of sequence, due to a rare condition - a power surge, for example, or an unusual gust of wind, or an unorthodox pilot maneuver.
Interrupting software's brittle sequencing matrix is akin to snapping a single strand of a frozen spider web; the results could range from inconsequential to catastrophic.
Trying to test a very complex digital system in advance, to catch and eliminate bugs, not to mention trying to gauge how safe the system will be, has proved daunting, if not impossible.
"Each component can be fairly small and well understood, but when you put them all together, things happen that are very difficult for us to predict, and, therefore, to even understand what we need to test for," Leveson said.
Software bugs have triggered large-scale outages of telephone service and been blamed for throwing a Patriot missile off course during the Gulf War, allowing an Iraqi Scud missile to strike a U.S. barracks and kill 28 soldiers. Denver's new airport opened nine months late because bugs in its computer-operated baggage system compounded to the point where the system had to be completely replaced.
Iron birds, black labels
How Boeing viewed these emerging software issues in planning the 777 was not a matter of public record or discussion. But the company promised that the 777 would be the most reliable, most thoroughly wrung-out airplane ever built.
Boeing and the FAA negotiated a number of demanding conditions required for the new jet to meet the certification timetable.
Boeing agreed to build a huge Systems Integration Laboratory (SIL): the airplane's entire computer system assembled inside a building and connected to a full-sized, working set of 777 wing and tail flight controls, called the Iron Bird. Boeing set a goal of having the Iron Bird make 3,000 lab "flights" before the start of actual flight tests in June 1994.
Boeing agreed that the actual flight tests would be conducted with a "black-label" fly-by-wire system, an industry term signifying that hardware and software in the system were finished and ready for production, and that no more changes would be made.
A system still in development and testing is called "red label." "Black-label freeze" is the time at which all systems must be complete so that conclusive tests can be run to see how everything works together.
The original development and certification plan for the 777 assumed Boeing would open the lab in February 1993, fully assemble the 777 fly-by-wire system in the SIL within a few months, and largely debug the hardware and software with the SIL and Iron Bird through 1993 and the first half of 1994. Then flight tests would get under way with a black-label system.
By October 1994, Boeing anticipated, it would be ready to dedicate one test jet to fly 1,000 demonstration flights proving the 777 was reliable enough to carry passengers across oceans.
Red labeled
The Systems Integration Laboratory opened on schedule. Boeing asked suppliers to have hardware and software 80 percent ready within three months and 100 percent complete by June 1993, in time for the start of the 3,000 lab flights on the Iron Bird.
Within a few weeks, it became clear there was no way the complete fly-by-wire system could be assembled and fully integrated in less than a year, much less within a few months.
In March 1993, barely a month into the development schedule, John Miller, Boeing's 777 division airworthiness chief engineer, wrote a letter to the FAA saying red-label computers would have to be used for flight tests still more than a year away.
This was an important change in the original plan. It meant time-consuming software-verification tests would be pushed into the closing stages of the project, with delivery deadlines looming.
The FAA ruled Boeing's rationale for allowing flight-testing to proceed with red-label systems was sound.
According to FAA officials and program workers who have spoken on condition they not be identified, Boeing officials took the position that the 777's design was so robust, with so many redundant layers of backup systems, that changing the development plan wouldn't matter.
A question then, and now, is whether changes in the development plan were a good idea, given the 777's novelty and complexity.
"This is the kind of thing where you want to have the least amount of risk by conforming to the program as it was designed," said Hal Sprogis, a retired airline captain and safety consultant who researches safety issues for pilot groups.
Sprogis, who still flies as an engineer on older 747s, opposed (as many pilots did) the idea of instantly approving a new twinjet for oceanic crossings. "When they manipulate it as they go, it gives you an uncertain feeling. You begin to mistrust what's happening."
AIMS and ACEs
No other jetliner has anything akin to the 777's AIMS computer, which serves as a central brain for most of the jet's 150 other computers.
Under John Shaw's original proposal, an ideal "shared resources" architecture would have all systems using the central computer brain.
But Boeing engineers weren't prepared to go quite that far with the 777.
They purposely separated a safety-critical system called the Primary Flight Computer (PFC) from AIMS. This, they argued, made the system more robust, while giving it a measure of conservatism.
Built by GEC-Avionics of London, the PFC works in concert with the AIMS-driven Flight Management System (FMS) computer and the autopilot computer to fly the airplane most of the time.
Typically a pilot taxis to the runway, takes off, and then immediately turns the airplane over to the FMS and autopilot.
If AIMS is the 777's brain, the Flight Management System is its intellect. The FMS constantly receives and analyzes information from sensors around the airplane, then tells the autopilot how to adjust the flight-control surfaces to keep on the proper course.
The PFC calculates the precise corresponding movements the wing and tail panels must make to fly at top efficiency. The PFC's calculations are then transmitted to a bank of computers, called the Actuator Control Electronics (ACEs), which issue the commands that move the flight-control surfaces.
If a serious bug occurs in the FMS, autopilot or PFCs, the pilot can switch them off and grab the wheel. Moving the cockpit controls would send signals directly to the ACE computers commanding the flight surfaces. Thus, the plane is flown "by wire," rather than with cable-linked control surfaces.
The 777's design calls for three PFCs always running - and in effect voting - on commands. Should a discrepancy arise, the majority rules. If one PFC goes down, there are two others to back it up. If all three PFCs crash, the pilot can switch them off.
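The voting idea can be sketched in a few lines. This is a minimal illustration of majority voting among redundant channels, not the 777's actual logic, which the article does not describe in detail. It assumes each channel reports either a command value or None if it has shut down, and that values must match exactly to count as agreement. The function name is hypothetical.

```python
from collections import Counter
from typing import Hashable, Optional, Sequence


def majority_vote(outputs: Sequence[Optional[Hashable]]) -> Optional[Hashable]:
    """Return the value a strict majority of live channels agree on.

    None in `outputs` marks a channel that has shut down. The result is
    None when no channel is live or when no strict majority exists.
    """
    live = [value for value in outputs if value is not None]
    if not live:
        return None
    value, count = Counter(live).most_common(1)[0]
    return value if count > len(live) / 2 else None


if __name__ == "__main__":
    print(majority_vote([5, 5, 7]))           # one channel is outvoted
    print(majority_vote([5, None, 5]))        # one channel down, two agree
    print(majority_vote([None, None, None]))  # nothing left to vote
```

A None result stands in for the case where the computers cannot be trusted and the pilot takes over, as described above.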
Without electricity none of this works, so redundancy permeates the power supply as well. There are backup power supplies, backup generators, backup batteries. If all those somehow failed, a backup of last resort, a small device called a Ram Air Turbine, essentially a small windmill, would be deployed from the right rear of the fuselage and generate enough electricity to keep minimum cockpit controls alive.
In the unlikely event of a complete electrical system shutdown, cables from the cockpit to selected spoilers and the horizontal tail section allow the pilot to glide straight and level until the electrical system is restarted.
A balky bus
Before the 777's varied computer systems could talk to each other, ARINC 629 had to be up and running.
But the new databus proved to be a major early stumbling block.
To ship or retrieve information via the databus, computers needed to be equipped with two high-powered, pre-programmed processing chips and a special coupler. No one had ever mass-produced these devices before.
Early versions of one chip overheated and stopped transmitting data. The other chip worked most of the time - if it came from one supplier - but only sporadically if supplied by an alternate contractor. The coupler was several months late.
"The bus is the basic skeleton of how the airplane systems communicate. It's the one thing that needed to be rock solid from the beginning," said a source close to the 777's development. "It took about a year to work out the wrinkles, which is about what you'd expect." Boeing, however, had projected the databus would come together smoothly in a matter of months in early 1993.
AIMS, too, stumbled early.
AIMS would consolidate most of the 777's digital processing power in two suitcase-sized central computers. In doing so, AIMS departed from the established fail-safe architecture of physically separating different computer functions.
Honeywell had to prove that a bug in one processing function would not corrupt other functions, and, conversely, that bugs in one program could be fixed without affecting other programs.
To do this, Honeywell turned to special software programming called "partitioning," which amounts to a strict, frenetic scheduling of time on the central processing modules. No one had ever attempted to partition so many computer programs on an airplane before.
Partitioning required the use of 18 pre-programmed circuits (called application-specific integrated circuits, or ASIC chips) to ensure that only one program at a time could access the central computer's powerful processing modules.
"Right now you get it for 50 milliseconds or 200 milliseconds, and as soon as you're done, he gets it," is how Don Morrow, Honeywell's 777 program manager, described the system at work.
Honeywell's challenge was to create circuits sophisticated enough to distinguish and properly sequence a high volume of data from numerous sources all at once with great accuracy. "We had to prove it was deterministic and it happened that way every time."
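The flavor of that scheduling can be sketched as a fixed, repeating timetable. The example below is an assumption-laden illustration, not Honeywell's design: the partition names and window lengths are invented, and on the real system the article says dedicated ASIC chips enforce the schedule. What the sketch does show is the deterministic property Morrow describes: for any instant, exactly one program owns the processor, and the answer is the same every time.

```python
from typing import List, Tuple

# (partition name, window length in milliseconds); all values hypothetical.
Schedule = List[Tuple[str, int]]

EXAMPLE_SCHEDULE: Schedule = [
    ("displays", 50),
    ("flight_management", 200),
    ("maintenance", 50),
]


def owner_at(schedule: Schedule, t_ms: int) -> str:
    """Return the partition that owns the processor at time t_ms.

    The schedule repeats forever, so the result depends only on where
    t_ms falls inside one frame, never on what the programs are doing.
    """
    frame_ms = sum(length for _, length in schedule)
    offset = t_ms % frame_ms
    for name, length in schedule:
        if offset < length:
            return name
        offset -= length
    raise AssertionError("unreachable: offset always falls inside the frame")


if __name__ == "__main__":
    for t in (0, 49, 50, 249, 250, 300):
        print(t, owner_at(EXAMPLE_SCHEDULE, t))
```

Because a program cannot run outside its window, a bug that makes one program hang cannot steal processing time from the others, which is the isolation the partitioning is meant to prove.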
The ASIC chips were crucial, and ended up being more than a year late from a chip supplier.
Morrow said Honeywell "really underestimated what it would take to develop" the 18 ASIC chips. "As a result we went downstream a little farther than we wanted."
Language woes
June 1993, once projected as the month when all software and hardware would be integrated in the SIL and ready to start flying the Iron Bird, came and went.
ARINC 629 and AIMS were falling further behind schedule by the day. This compounded delays with the suppliers of the dozens of computers that would feed into AIMS via the new databus.
Troubles cropped up with the Primary Flight Computers. Boeing initially assigned three separate teams at GEC-Avionics to write code that would control the wing and tail panels. This was standard practice. A software bug in one of the PFCs would be unlikely to turn up in the other two, which would outvote the third.
According to a source close to GEC-Avionics, the software teams kept asking for clarifications of Boeing's specifications for the PFC code, slowing development, until Boeing finally chose to have one team write a single program for all three PFCs.
(A British magazine, Computer Weekly, reported last week that some leading European software-safety experts have called for a detailed review of the single program Boeing ended up with before the 777 begins flying passengers.)
Once the PFC code was written, Boeing could not keep all three PFCs running at the same time without making the ARINC 629 databus crash. While only one PFC is needed to fly the plane, not having the other two available wiped out the safety margin.
Programmers trying to smooth out the PFC's code problems had plenty of avenues to explore.
First of all, they were using a language called Ada to write the PFC code. Ada has long been used in military applications on mainframe computers and was chosen by Boeing as the standard language for all of the 777's software. The 777 represented the first widespread use of Ada, a relatively cumbersome language, on computers driven by microprocessors.
In using Ada, programmers had to write 10 lines of code to do something that a newer, more elegant language, like C++, might do in three or four lines. More lines of code meant more opportunity for errors.
There were other hangups.
Ada and C++ are high-level programming languages that use recognizable phrases, such as "jump to test 1." Code writers use a translating tool called a "compiler" to translate high-level commands into machine code, the schematic of ones and zeros that guides electricity through the maze of integrated circuits.
Contractors writing compiler code had to translate Ada commands into machine code for two completely different processing chips (one made by Intel, the other by Motorola) OK'd by Boeing for use on the 777.
Ada programmers complained that early versions of the compiler code were slower than they expected.
"You would say `x equals y' and expect (the compiler) to translate that into maybe three machine steps," a 777 systems supplier said. "It turns out, with all the considerable options of available data, it actually translates into 20 steps."
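The gap the supplier describes, between one high-level statement and the many low-level steps it becomes, can be seen in miniature with a short Python sketch. This is an analogy only: it lists the bytecode instructions Python's own interpreter generates for a simple assignment, not the machine code an Ada compiler would emit for the Intel or Motorola chips, and the function name `assign` is a hypothetical chosen for illustration. The exact instructions and their count vary by Python version.

```python
import dis


def assign(y):
    # One high-level statement: "x equals y".
    x = y
    return x


# Collect the names of the lower-level instructions the interpreter
# generated for the function above.
instructions = [ins.opname for ins in dis.get_instructions(assign)]

print(len(instructions), "instructions:", instructions)
```

Even this trivial function expands into several separate load, store and return steps. The supplier's complaint was that the early Ada compilers expanded far more than expected.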
By the fall of 1993 things were backing up steadily.
"The policy became don't change anything, proceed on schedule, explain away everything we can, then do a product enhancement after delivery," the supplier said. "As bugs showed up, we were told to expand the acceptable criteria to allow for the problem, rather than to fix the problem."
Treacy, the FAA's avionics expert, acknowledged that compromises were accepted, but said that's not unusual for any big software-development project. The 777's safety margin, as far as the FAA was concerned, remained intact, he said.
"When you've decided that there are two ways to fix a particular problem, an elegant way and a Band-Aid way, and one's quicker than the other, you take the one that's quickest," Treacy said, "especially if it is not a real overriding safety thing."
First flight
By the end of 1993, it became clear that if test flights were to begin on schedule six months later, in June 1994, it would have to be with something less than a fully functional fly-by-wire system.
As he was about to retire at the end of 1993, Boeing senior executive Dean Thornton told a Seattle Post-Intelligencer reporter: "This airplane is one big computer. I'm not saying we're going to fall off a cliff, but if I'm going to stay awake nights, it will be over the software, not the hardware."
By the spring of 1994, GEC Avionics still could not get all three PFCs to boot up simultaneously for an extended period. On May 31 a new load of the PFC software arrived from London and was installed in the SIL and on the No. 1 777.
On June 7, with the aviation community eagerly waiting word from Boeing about the 777's first flight, another new PFC software load arrived to replace the one Boeing had tried out in the lab and on the airplane a week earlier.
On June 12, a cloudy, blustery day, the 777 lifted off from Everett's Paine Field before a crowd of company officials, dignitaries and reporters. Cheers went up. Upon returning from a nearly four-hour flight, chief 777 test pilot John Cashman declared: "Best first flight ever!"
Among the people who got a more frank description from Boeing was the late Berk Greene, then the FAA's chief 777 test pilot. Greene, who communicated daily with the aviation community on computer bulletin boards, died last October.
In a message to a colleague, Greene described the FAA's view of the 777 program to that point:
"The FAA has watched from a distance all the testing being done in this lab as the vendors began to deliver software, and Boeing got various elements on line. It hasn't been a very pretty spring, as vendors were late, some parts had lots of problems getting on line.
"The later they got, the worse things looked, with Boeing even taking measures like off-loading some integration testing in the labs (because they were behind) and doing these test sequences in the airplanes still on the factory floor. Lots of horror stories there as well, things like mismatches apparent between hydraulic actuator/surface models on the airplane and what was created in the lab, resulting in weird vibratory modes that weren't evident in the lab.
"Planned first flight date began to slip because of all this, and in the end, slipped about two weeks. Finally, enough things came together so that risk was limited enough to take a chance on the flight. The amazing part: NOTHING NEW SHOWED UP on the flight.
"And that's why John Cashman (the Boeing test pilot) was bubbling `it was beautiful.' There were pages and pages of known bugs and nonfunctionality, with workarounds going into this flight, but nothing new was found."
New black-label date
Within a month after the 777's first flight, new bugs materialized.
GEC had programmed the Primary Flight Computers to confirm proper operation of the flight surfaces by monitoring "feedback" signals; if the fed-back signal from, say, the rudder was significantly different than the commanded signal, the PFC was programmed to shut down the rudder.
On several early 777 flights, the airplane's rudder fed signals back to the PFC indicating it had moved about 2 degrees beyond what the PFC had commanded. The PFC immediately shut down all rudder control.
This occurred because the code writers had not anticipated how much the 777's tail flexed when the rudder moved, a source working on the 777 program said. The problem was fixed by expanding the PFC's range for acceptable feedback signals, the source said.
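The monitoring logic described above can be pictured with a minimal Python sketch. It is not the PFC's actual code, which was written in Ada. The function name, the 10-degree command and both tolerance values are hypothetical, chosen only to mirror the roughly 2-degree discrepancy the source reported.

```python
def rudder_stays_engaged(commanded_deg, feedback_deg, tolerance_deg):
    """Return True if feedback is within tolerance of the command.

    A False result stands for the monitor shutting down rudder control.
    """
    return abs(feedback_deg - commanded_deg) <= tolerance_deg


# Hypothetical values: the tail flexes, so the rudder reports about
# 2 degrees more travel than the computer commanded.
commanded = 10.0
feedback = 12.0

NARROW_TOLERANCE = 1.0   # assumed original window (illustrative)
WIDENED_TOLERANCE = 3.0  # assumed window after the fix (illustrative)

before_fix = rudder_stays_engaged(commanded, feedback, NARROW_TOLERANCE)
after_fix = rudder_stays_engaged(commanded, feedback, WIDENED_TOLERANCE)

print("engaged with narrow window:", before_fix)
print("engaged with widened window:", after_fix)
```

With the narrow window the 2-degree discrepancy trips the monitor; with the widened window the same reading is accepted. The trade-off is that a wider window also tolerates a larger genuine fault before the monitor reacts.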
It was not much surprise to Boeing suppliers when the company advised them in late summer that black-label freeze - the date after which no more hardware or software changes could be made, once projected as June 1994 - was being moved to October 1994. Officially, that was still the date for beginning the 1,000 special demonstration flights to earn the ocean-crossing rating.
In September 1994 Cashman uncovered another rare condition: The big airplane could roll into a steep dive at low speeds.
FAA rules dictate that a jet must be designed so that it does not roll drastically as it slows and nears a stall, the speed at which it loses lift. Minimizing roll during a stall improves a pilot's chances of pulling out of the stall with a simple maneuver.
But Cashman discovered the 777 had a tendency to snap into an acute roll and dive several thousand feet once it began to stall. This happened unexpectedly in a test flight over Southern California. No one was hurt but word spread that several engineers along for the ride lost their lunch.
Revisions to the PFC code controlling the ailerons, flaps and rudder were pursued. While briefing reporters about flight-test glitches, Cashman noted: "I think, still, the challenge is in the electronics area."
`A long struggle'
On Oct. 6, Boeing held a press briefing at its Everett plant for the rollout of 777 No. 4, the first to be outfitted with a finished cabin interior, and the airplane earmarked to make the 1,000 proving flights.
Reporters asked when those flights would begin. The answer was vague. "We're going to start when we're ready . . . when we get concurrence among FAA and among ourselves," said Jim Metcalfe, Boeing senior engineering test pilot. "There are a few changes we're making on the airplane."
It took another 12 weeks, until Dec. 28, 1994, for the flights to begin, and they began with red-label AIMS and PFC computers. Boeing and its suppliers won't discuss the delay.
John Aplin, GEC-Avionics marketing director, asked about development of the PFC computers, said: "I would characterize it as a long struggle, a lot of hard work, but never any real shocks."
Boeing officials persuaded the FAA that the flights with key systems still red-labeled would nonetheless be a valid demonstration of a "mature" airplane.
Bob Ireland, United Airlines' 777 factory representative, noted that engineers like to push the black-label-freeze deadline as far as they can, because any small change after that can generate mountains of paperwork. The differences between the red-label software used on the proving flights and the black-label systems put into commercial use this week are believed to be insignificant, he said.
"The simple notation of red or black label itself is not relevant," Ireland said.
`Little glitches'
To make 1,000 flights, simulating a year's worth of actual airline usage, 777 No. 4 would have to average 8 1/2 flights a day, seven days a week. (Airlines aren't likely to fly the 777 more than two or three flights a day because of the jet's size and range.)
Some observers questioned whether the proving flights proved much of anything. By necessity, many tests involved turning around and flying while the engines were still warm. During an actual gate turnaround, the 777's engines will cool down entirely, but Boeing persuaded the FAA the difference wouldn't matter.
On Feb. 18, with 460 flights completed, 777 No. 4's right engine seized during an oil change. The plane remained grounded for the next 12 days. When 777 No. 4 finally resumed flying March 3, it had about eight weeks to complete 540 flights, an average of 9.6 flights per day.
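The required pace can be checked with a few lines of Python. The sketch assumes "about eight weeks" means exactly 56 days; the variable names are illustrative.

```python
TOTAL_FLIGHTS = 1000
flights_completed = 460

# Assumption: "about eight weeks" is taken as exactly 8 * 7 = 56 days.
days_remaining = 8 * 7

flights_remaining = TOTAL_FLIGHTS - flights_completed
required_rate = flights_remaining / days_remaining

print(flights_remaining, "flights left,",
      round(required_rate, 1), "per day needed")
```

The result matches the figures in the text: 540 flights left, at a little over nine and a half flights a day, compared with the 8 1/2 a day originally planned.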
The PFCs finally received black-label certification in late March; AIMS certification followed about a month later, just as the 1,000 flights were wrapping up.
In the midst of completing paperwork to black label the PFC, Boeing and GEC scrambled to eliminate a stubborn problem unique to digital controls, called Pilot Induced Oscillations (PIOs), the same phenomenon linked to the crash of the Gripen fighter in Sweden.
Flight tests had shown that under certain conditions the 777's nose and tail would bend up and down 3 times per second, like a jiggling hot dog, though the center of the plane remained steady.
Aviation Week & Space Technology magazine reported that this had occurred on several test flights. In one case, pilots attempted to smooth out the oscillations by moving the control yoke rapidly back and forth. But the digital signals couldn't keep up and the oscillations worsened.
On another flight the oscillations shook Cashman so much that his seat began to slide back and forth, causing him to push and pull on the yoke, the magazine reported. In order to steady himself enough to regain control, Cashman had to brace his foot on the dash. Then in April a guest airline pilot was attempting a touch-and-go landing when the oscillations began. The pilot pulled back sharply on the yoke and fought through the oscillations.
Peter Mellor, a software lecturer and consultant at City University in London, who recently briefed Boeing on the A320's digital system, said the emergence of PIOs so late in the development program "indicates that the design is still immature. I get a feeling of foreboding that more of these little glitches are waiting to come out."
Corporate ways
At least part of the blame for software bugs, say those who have studied them, lies in the corporate setting, where software-writing discipline faces the pressures of time, money and career advancement.
"There is always a tension," said Bill Curtis, co-founder of Austin, Texas-based TeraQuest, a consulting group which helps corporations sharpen software development.
"If you're late to the marketplace with a product that doesn't have enough functionality or isn't reliable enough, it could kill the company. And who wants to be responsible for killing the company?"
Carnegie Mellon University's Software Engineering Institute has begun a program tracking software-engineering practices at more than 260 leading organizations. It uses a Capability Maturity Model to assess software-engineering discipline.
On a scale of one to five, 75 percent of the participants remain stuck at Level 1, the chaos level, according to the institute. Such companies have scant design processes and no real way of knowing whether they are on the right or wrong track in designing complex software. Only two elite groups rank themselves at Level 5, representing an optimum process; 24 percent are at Levels 2 or 3, the early stages of embracing disciplined practices.
Companies participating in the institute's voluntary program, including Boeing, are allowed to grade themselves, said Curtis, a visiting scientist at the institute.
Bob Jorgenson, spokesman for Boeing Computer Services, the organization that supplies the basic hardware and software used to develop the 777, said Boeing considers itself to be at Level 2 or 3, with respect to avionic systems.
"We certainly aren't real mature in our maturity model, but we're committed to it and are into it as much as anybody in the country," he said.
The Carnegie Mellon model is based on the belief that safe software results from standardized, disciplined practices, such as submitting new designs to extensive peer review and conducting statistical analysis throughout development.
Leveson, of the University of Washington, warns that a false sense of security may arise from focusing "purely on schedule and process, not on quality. I think basing all your decisions on that is just dangerous."
The answer, she argues in her book, "Safeware," is to lessen the degree to which computers and software govern potentially hazardous systems.
"There is no magic solution to any of this," she said. "We're going to have to accept that we may not be able to have all the complexity to do all the things we want."
Mike Hynes, an Oklahoma City-based aviation consultant and member of the International Society of Air Safety Investigators, believes economic factors may be pushing the use of computers in the cockpit on a dangerously accelerating curve.
"First of all, computers are not foolproof enough yet. Secondly, programmers can and do make errors. Most of these errors cannot be tested out. They are only found later on when disaster happens," Hynes said.
Said Leveson: "All of these modern aircraft are pushing the technological envelope. The question is how far can they go. And that hasn't been answered yet."
On deadline
Developing a jetliner has always been an exercise in managing imperfection. Parts don't mesh like they do on the drawing board. Systems fail. The art lies in anticipating the most serious failures, then proving to regulators they've been adequately accounted for.
In a rousing April 19 ceremony at Boeing Field, the FAA certified the 777 as safe. In doing so, the agency endorsed Boeing's approach to developing the airplane's fly-by-wire system.
Yet given the understaffed FAA's loose oversight - principally auditing a development plan that seemed constantly under alteration - the certificate mainly means a satisfactory level of paperwork had been achieved.
"I'm in a very embarrassing situation," said Mike DeWalt, the FAA's national software expert. "To say the software is safe, I cannot tell you that. I can tell you the software (development) has followed our procedures."
Although Boeing officials declined to be interviewed for this story, they have professed in numerous industry forums that the 777's highly redundant system design, coupled with great care and diligence during the development and testing process, has produced a safe airplane.
------------------------------------------------------------------
A FEW KEY TERMS IN UNDERSTANDING THE BOEING 777'S SOPHISTICATED SYSTEM OF COMPUTERIZED FLIGHT CONTROL.
------------------------------------------------------------------
Fly-by-wire -- Using electrical signals rather than mechanical linkages to move airplane-control surfaces such as ailerons and the rudder. The signals may originate with a computer or with a pilot in the cockpit.
AIMS -- Aircraft Information Management System, the central computer that manages data exchange among more than 150 processors aboard a 777.
ARINC 629 -- The 777's advanced digital databus, the electronic "highway" on which computers exchange data. ARINC 629 enables two-way simultaneous communication among many computers connected to the databus. An earlier databus standard, ARINC 429, enabled only one-way, point-to-point data transfer from one computer to another.
Flight Management System -- The FMS is a computer system the pilot programs with data such as destination, course and altitude and which then receives information from sensors in the airplane about the craft's position, heading, attitude, speed etc. The FMS commands the autopilot to make control adjustments that can fly the airplane from shortly after takeoff.
Primary Flight Computer -- On the 777, the PFC calculates the precise adjustments that must be made to control surfaces to achieve the results commanded by the FMS and autopilot.
Actuator Control Electronics -- ACE units are specialized computers that translate digital signals from the PFC or cockpit into mechanical movements. The movements are performed by hydraulic "actuators" mechanically coupled to flight-control surfaces.
Red label -- A designation given a computer system in which software and hardware still are under development. In a complex system, for example, many separate pieces of software that must work together may be developed separately and require integration testing, revision and retesting.
Black-label freeze -- The point at which software and hardware development must be finished to allow testing of a system with all its final parts and software code in place.
-----------
FLY-BY-WIRE
-----------
Pros
-- Saves weight. Replacing cables and pulleys with computers and wires saves weight. Less weight for controls means more cargo/passenger payload.
-- More reliable. Fewer moving parts to wear out. Software doesn't wear out.
-- More efficient. Computers can be much more precise in adjusting controls and operating engines in optimum configuration.
-- Two-man crew. Flight engineer is superfluous; computer now monitors more systems than flight engineer ever could.
Cons
-- Random glitches. Aircraft digital systems have a long history of producing fault signals for which mechanics later can find no cause.
-- Interface problems. Pilots can misinterpret computer modes. Conversely, software sometimes does not account for everything a pilot might ask the plane to do.
-- Complexity. It may be impossible to anticipate and test all the ways hardware and software can interact.
--------------------------------------
TIMES AEROSPACE REPORTER BYRON ACOHIDO
--------------------------------------
Times aerospace reporter Byron Acohido has reported on the 777 since the airplane was proposed in 1989. This story is based on interviews with Boeing suppliers, customers and competitors, aeronautical engineers, mechanics, airline workers, aviation journalists, Federal Aviation Administration pilots, inspectors and managers, aviation-safety experts, stock analysts and airline financiers. Although Boeing officials declined to be interviewed for this story, comments from past Boeing interviews and press briefings were used where appropriate.
--------------------------------------------------------
COMMERCIAL SERVICE WITH THE BOEING 777 BEGINS THIS WEEK.
--------------------------------------------------------
Tomorrow in The Times, Polly Lane reports from London on British Airways' expectations for the new twinjet and the hoopla for Wednesday's inaugural flight by United Airlines.
Wednesday, she reports from her seat aboard the new United jet en route from London to Washington, D.C.