Rubbernecking Software Accidents
Everybody who drives past an accident on the road cannot help but look at the damage. We all slow down and crane our necks, even though doing so adds to the very traffic we were just waiting in. “Rubbernecking,” as it’s called, is human nature. I personally don’t think it has to do with wanting to see injury, death, or destruction, but rather with a drive to learn from others’ mistakes. I always mutter to myself, “What happened here?” as I drive by.

In the same way that I rubberneck on the road, I love reading the post-mortem documents written when things go terribly awry with software. I recently found a copy of the post-mortem document for Microsoft Word 1.0, written December 17, 1989. Software development today does not look like it did back then. Today there are fancy IDEs, computing power that folks back then could not even imagine, and a little thing called the internet. But despite the differences over the nearly 35 years since it was published, some things are eerily similar.

The project started in 1984 with a single developer but didn’t ship until 1989. It took 55 man-years to launch, and there were a total of 12,511 bugs, of which 9,377 were deemed fixable. The reasons for this slip are very relatable to the modern software developer. Management churn, shifting of key personnel to other priorities, a focus on adding more features instead of concentrating on core use cases, and pernicious performance issues all contributed to the 5-year development cycle. But I think the core issue of the project had to do with schedule management and how developers were incentivized. Here are some quotes that have stuck with me:

“The methods of scheduling used were fatally flawed. A schedule should be considered a tool used to predict a ship date, it should not be considered a contract by development. Because there was so much pressure to meet the schedule, development got into a mode which Chris Mason refers to as ‘infinite defects’.

“Developers get credit every time they can check a feature off, so they are more inclined to mark off their current feature and go on even though it really is not done. There was a prevailing attitude of the ‘testers will find it’ when thinking about potential bugs in code being developed. In many cases they did find it, and that is what caused our stabilization phase to grow from the expected 3 months (which is a pretty random number anyway), to 13 months. … The idea that a schedule is God leads to infinite defects, as explained above. Also, the belief that a schedule must be ambitious so that the development team will work hard is severely misguided.”

At the beginning of the project, developer performance was judged on lines of code written and on moving assigned work to “Done” in order to keep to the schedule. Ironically, poor-quality software written on time just meant that the testing and QA cycles slipped the schedule even more. Adding more features only compounded the problem. Only when the team pivoted to rewarding quality of code and documentation, instead of quantity of code, did the project get back on track.

If you took the post-mortem and changed some of the numbers and the names of the technologies, it would largely still hold up today. Now, I’m not here to pick on Microsoft or aggressive project managers. My point is that there is a rich record of mistakes and failures that people have documented in our industry. My own company has thousands of these documents, called COEs (correction of errors), where we endeavor to publish our failures, fix our broken systems and processes, and, hopefully, learn these hard lessons only once. And just like rubbernecking on the highway, though without the traffic consequences, I subscribe to the COE-watchers email distribution list internally. Not because I enjoy the carnage (though it is amusing at times), but because these documents are so information-rich.
It’s up to us whether we want to learn lessons by studying the past, or whether we learn these lessons the hard way by making the same mistakes ourselves. Did you find any interesting bits in the Word 1.0 post-mortem? I could probably write several newsletters on just this one document.

Support

If you're enjoying these emails, I would be honored if you supported me via Patreon. If you can't afford to support me, it's all good. Please continue to enjoy the newsletter, which I plan to make free forever. Patrons get an archive of previous newsletters, early access to videos, and other goodies. If you’re in a position to help, I would really appreciate your support.

Share the love

If you are enjoying the content of this newsletter, please share it with your network. Much of the content here is exclusive to the email newsletter and will not be featured in my YouTube videos or on my other social media accounts.

https://newsletter.alifeengineered.com/general