Sunday, March 14, 2010

the hazards of cumulative programmers

Ever wonder how long an error or omission in a piece of code can go undetected?

I found one in a piece of FORTRAN code last week that had gone undetected for around 23 years. It rested on the assumption that a slab of steel would be processed the same way every time. But conditions do change, and a small change in incoming dimensions, coupled with a particular processing selection, produced an unfavourable result in the final product. As usual, while it would be tempting to put this down to one programmer's shortcoming, the real story is more mundane.

Programmer A assumes that the product will always be processed in the same way and determines critical product parameters based on where he thinks the process starts and ends.

Programmers B, C, D, E, F and G work on the same 25,000 lines of code over the next 23 years, making unrelated changes that each slightly alter the conditions under which programmer A's solution was valid, until, in combination, they result in a truly unfavourable, fuel-tanker-meets-freight-train collision-cum-explosion of a bad product outcome.

Scrap many tonnes of finished product and suddenly programmer H, who has been left to cook with such spaghetti, is caught in the headlights of a large truck being driven by the angry production manager.

So programmer H has to sort it all out over the following week until the "aha" moment arrives. One missing line of code was all it took to wreak havoc. Inserting it is simple but satisfying, and all but guarantees some future-proofing against a repeat of this carnage.
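I can't show the real line here, so what follows is only a hypothetical sketch, in Python rather than the original FORTRAN, of the general idea: write the assumption a calculation depends on into the code as an explicit check, so that later unrelated changes fail loudly instead of quietly producing bad product. The names `crop_length` and `ASSUMED_ROUTE`, the route labels and the formula are all invented for illustration.

```python
# Hypothetical illustration only: names, routes and numbers are invented,
# not taken from the original FORTRAN code.

ASSUMED_ROUTE = "standard"  # the only processing route the formula was written for


def crop_length(slab_length_mm: float, route: str) -> float:
    """Return a critical length for a slab, valid only for the assumed route."""
    # The one-line guard: make the hidden assumption explicit.
    if route != ASSUMED_ROUTE:
        raise ValueError(f"crop_length assumes route {ASSUMED_ROUTE!r}, got {route!r}")
    return 0.02 * slab_length_mm  # placeholder formula, not a real process rule


print(crop_length(10000.0, "standard"))
```

With a guard like this, the day somebody adds a new processing route the program stops with a message naming the broken assumption, rather than carrying on with a number that no longer means what programmer A thought it meant.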

Imagine an error like that bringing down a bird at the end of NASA's shuttle program. Unthinkable, but possible. How many probes have now been lost to such errors? And it ain't a short trip to Mars to reload, either.
