Software Engineering: Ariane 5

Ariane 5 was designed by the European Space Agency (ESA) as a replacement for the successful Ariane 4 launcher. The intention was to create a reliable, high capacity, launch vehicle for ESA that could be used to support their contribution to the International Space Station as well as a range of other commercial and scientific launches.

On June 4, 1996, the US$500 million spacecraft was launched for the first time:

"The countdown, which also comprises the filling of the core stage, went smoothly until H0-7 minutes when the launch was put on hold since the visibility criteria were not met at the opening of the launch window (08h35 local time). Visibility conditions improved as forecast and the launch was initiated at H0 = 09h 33mn 59s local time (=12h 33mn 59s UT). Ignition of the Vulcain engine and the two solid boosters was nominal, as was lift-off. The vehicle performed a nominal flight until approximately H0 + 37 seconds. Shortly after that time, it suddenly veered off its flight path, broke up, and exploded. A preliminary investigation of flight data showed: "
  1. "nominal behaviour of the launcher up to H0 + 36 seconds; "
  2. "failure of the back-up Inertial Reference System followed immediately by failure of the active Inertial Reference System; "
  3. "swivelling into the extreme position of the nozzles of the two solid boosters and, slightly later, of the Vulcain engine, causing the launcher to veer abruptly; "
  4. "self-destruction of the launcher correctly triggered by rupture of the links between the solid boosters and the core stage. "

"The origin of the failure was thus rapidly narrowed down to the flight control system and more particularly to the Inertial Reference Systems, which obviously ceased to function almost simultaneously at around H0 + 36.7 seconds. "

ARIANE 5 Flight 501 Failure Report by the Inquiry Board

Analysis of the failure

Following the failure, a board of inquiry was established to determine the cause. Despite the extent of the explosion, investigators were able to recover the memory contents of the on-board computer systems and, in conjunction with later tests, to determine the cause of the failure reproducibly.

Sadly, the primary cause was found to be a piece of software which had been retained from the previous launcher's systems and which was not required during the flight of Ariane 5. The software was used in the Inertial Reference System (SRI) to calculate the attitude of the launcher. In Ariane 4, this software was allowed to continue functioning during the first 50 seconds of flight, as shutting it down could otherwise delay the launch if the countdown was halted for any reason. This was not necessary for Ariane 5. In addition, the software contained implicit assumptions about its parameters, in particular the horizontal velocity, that were safe for Ariane 4 but not for Ariane 5.

The failure occurred because the horizontal velocity value exceeded the maximum value for a 16-bit signed integer when it was converted from its 64-bit floating-point representation. The conversion raised an exception which was not caught and which therefore propagated up through the processor and ultimately caused the SRI to fail. The failure triggered the automatic fail-over to the backup SRI, which had already failed for the same reason. This combined failure was then reported to the main computer responsible for controlling the engine nozzles of the rocket; however, the diagnostic information was misinterpreted as valid flight data. Acting on this invalid data, the main computer swung the engine nozzles to an extreme position and the launcher was destroyed shortly afterwards.
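The conversion problem can be illustrated with a minimal Python sketch. This is not the flight code: the function names, the exception name `OperandError`, and the sample values are hypothetical and chosen only for illustration, and the target is assumed to be a 16-bit signed integer as described in the Inquiry Board's report. The sketch contrasts a conversion that raises when the value does not fit with one that clamps to the representable range.

```python
INT16_MIN = -32768
INT16_MAX = 32767


class OperandError(Exception):
    """Hypothetical error: the value does not fit in a 16-bit signed integer."""


def to_int16_checked(value: float) -> int:
    """Convert a float to a 16-bit signed integer, raising if it is out of range."""
    truncated = int(value)  # int() truncates toward zero
    if not INT16_MIN <= truncated <= INT16_MAX:
        raise OperandError(f"{value!r} does not fit in 16 bits")
    return truncated


def to_int16_saturating(value: float) -> int:
    """Convert a float to a 16-bit signed integer, clamping instead of raising."""
    return max(INT16_MIN, min(INT16_MAX, int(value)))
```

With these definitions, `to_int16_checked(1200.0)` returns 1200, while `to_int16_checked(40000.0)` raises `OperandError`. Whether raising, clamping, or some other response is appropriate depends on what the caller does with the result; the point of the sketch is only that the out-of-range case has to be decided deliberately.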

The immediate trigger of the failure was thus a single unprotected conversion in one line of code.

"An underlying theme in the development of Ariane 5 is the bias towards the mitigation of random failure. The supplier of the SRI was only following the specification given to it, which stipulated that in the event of any detected exception the processor was to be stopped. The exception which occurred was not due to random failure but a design error. The exception was detected, but inappropriately handled because the view had been taken that software should be considered correct until it is shown to be at fault. The Board has reason to believe that this view is also accepted in other areas of Ariane 5 software design. The Board is in favour of the opposite view, that software should be assumed to be faulty until applying the currently accepted best practice methods can demonstrate that it is correct. (emphasis added)"
ARIANE 5 Flight 501 Failure Report by the Inquiry Board

While the Inertial Reference System could not be tested by "Black Box" methods directly without an actual flight, it would have been possible to simulate the launch by supplying the processor with appropriate input. This would have detected the fault, but this type of test was not performed. Instead, the SRI was replaced by a simulator which fed correct values to the main onboard computer during the pre-flight testing.
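The kind of simulated-input test described above might look like the following Python sketch. It assumes a 16-bit signed target for the converted value, and the two velocity profiles are invented numbers, not flight data; `fits_int16` and `first_overflow_time` are hypothetical helper names. The idea is simply to drive the conversion with a whole simulated trajectory and report the first instant at which a value no longer fits.

```python
INT16_MIN = -32768
INT16_MAX = 32767


def fits_int16(value: float) -> bool:
    """Return True if the truncated value is representable in 16 signed bits."""
    return INT16_MIN <= int(value) <= INT16_MAX


def first_overflow_time(profile):
    """Return the first time stamp whose value does not fit, or None if all fit."""
    for seconds, value in profile:
        if not fits_int16(value):
            return seconds
    return None


# Invented profiles: (seconds after lift-off, horizontal velocity value).
slower_profile = [(t, 600.0 * t) for t in range(0, 51)]
faster_profile = [(t, 1000.0 * t) for t in range(0, 51)]

print(first_overflow_time(slower_profile))  # None: every value fits
print(first_overflow_time(faster_profile))  # 33: 33000.0 exceeds 32767
```

A test of this shape passes for the slower profile and fails for the faster one, which is the distinction that matters when code is moved to a vehicle with a different trajectory.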

In addition, the requirements for the Ariane 5 system differed significantly from those of the Ariane 4 system, despite their overall similarity. The changed requirements were, however, not taken into account when the code was reused, and the associated assumptions were not examined to ensure their validity.

As a result of the investigation, fourteen recommendations were made by the Board, five of which related to test procedures and software validation. This failure was entirely preventable by reasonable testing.

Lessons for software engineering

This particular failure has been used as a justification for the "design by contract" methodology, and it is certainly true that better documentation of assumptions and parameters might have prevented the failure. However, it was also the failure to test the SRI under Ariane 5 conditions at any stage of development that really allowed the fault to reach flight. Simulations conducted after the failure were able to easily reproduce the problem and identify the cause; these should have been conducted prior to launch.
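One way to picture the "documented assumptions" idea is the Python sketch below. It is only an illustration: the `Assumption` record, the `violated_assumptions` function, the parameter name, and all numeric ranges are hypothetical. Each reused component carries an explicit list of the input ranges it assumes, and these are compared against the ranges expected on the new vehicle before the component is reused.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Assumption:
    """A documented input range that a component relies on."""
    parameter: str
    low: float
    high: float


def violated_assumptions(assumptions, expected_ranges):
    """Return the assumptions not covered by the expected operating ranges.

    expected_ranges maps a parameter name to a (low, high) tuple.
    A parameter with no expected range is flagged, since it was never examined.
    """
    violations = []
    for assumption in assumptions:
        expected = expected_ranges.get(assumption.parameter)
        if expected is None:
            violations.append(assumption)
            continue
        expected_low, expected_high = expected
        if expected_low < assumption.low or expected_high > assumption.high:
            violations.append(assumption)
    return violations


# Invented numbers for illustration only.
component = [Assumption("horizontal_velocity_value", -32768.0, 32767.0)]
old_vehicle = {"horizontal_velocity_value": (0.0, 30000.0)}
new_vehicle = {"horizontal_velocity_value": (0.0, 50000.0)}

print(violated_assumptions(component, old_vehicle))  # empty list
print(violated_assumptions(component, new_vehicle))  # the velocity assumption
```

A check like this does not replace testing, but it makes the assumption visible at the moment the code is carried over to a system with different requirements.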

The failure suggests the following lessons for other software systems:

References

European Space Agency home page for Ariane 5
http://www.esa.int/export/esaLA/ASEVLU0TCNC_index_0.html
Notice the careful phrasing "[Ariane 5's] first successful launch took place on 30 October 1997"

Put it in the contract: The lessons of Ariane
http://www.irisa.fr/pampa/EPEE/Ariane5.html

ARIANE 5 Flight 501 Failure Report by the Inquiry Board
http://java.sun.com/people/jag/Ariane5.html