Premature Optimization
We’ve all heard, ad nauseam, Donald Knuth’s observation from over 35 years ago that premature optimization is the root of all evil.
Of course, Knuth was talking about prematurely optimizing code for runtime throughput.
The problem is that it’s almost impossible to know what will need to be optimized unless you actually profile the code under real conditions. When you are first writing the code, time spent optimizing for performance is likely wasted, because the code you optimize probably won’t turn out to be an actual bottleneck at runtime.
This doesn’t mean that you should intentionally write poorly-performing code, just that you should generally write it in the most straightforward, readable manner possible, using reasonable, well-understood algorithms and data structures. When you do this, you wind up with maintainable code (because it is easy to read and understand), and there is a very good chance that your compiler will optimize your code for you. Virtually all modern compilers, including javac and the Java JIT compilers, are optimizing compilers, and the JIT compilers optimize on the fly at runtime.
I’ve agreed with Knuth’s advice ever since I first heard it years ago, and, like most folks, I’ve had occasion, especially early in my career, to find out how right he was: I optimized code before bothering to profile it, only to discover through profiling that the performance problem lay elsewhere, in some totally unsuspected bit of code.
So, what’s lost when you optimize code prematurely? What’s the downside? I mostly assumed that the major downside was lost developer time. The time that you spend doing the possibly unneeded optimization is time that you’re not spending adding features, writing tests, or refactoring the code. Recently, however, I was the recipient of someone else’s prematurely over-optimized code, and it struck me that premature optimization can also have a large negative effect on the stability and maintainability of the codebase.
The code in question was somewhat time-critical, and went to great lengths to avoid storing a single object in a HashMap and then looking it up again, under certain special conditions. Since both insertion and lookup in a HashMap are, on average, O(1), it is quite possible that the fairly complex nest of ‘optimized’ if/then/else statements actually performed worse on average than a simple HashMap put() and get() would have.
What really struck me, however, was the increased complexity and the decreased readability of the code due to the optimization. Indeed, I was looking at this code because it contained a bug. If the optimization had not been done, the simple put() and get() would have left virtually no room for a bug, and the code would have been vastly more readable, and therefore more maintainable. It would likely also have had slightly better throughput at runtime.
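To make the contrast concrete, here is a minimal sketch of the two shapes this kind of code can take. It is not the actual code from the anecdote; the string keys, the “frequent”/“fallback” special cases, and the class names are all hypothetical, invented only to illustrate the pattern.

import java.util.HashMap;
import java.util.Map;

public class LookupSketch {

    // Straightforward version: one put(), one get(). Both are O(1) on
    // average, easy to read, and hard to get wrong.
    static class SimpleLookup {
        private final Map<String, String> values = new HashMap<>();

        void store(String key, String value) {
            values.put(key, value);
        }

        String find(String key) {
            return values.get(key);
        }
    }

    // "Optimized" version: special-cases a couple of hypothetical hot keys
    // in dedicated fields to avoid touching the map. Every branch is a
    // place for a bug to hide, and on average it buys nothing over the
    // HashMap.
    static class OptimizedLookup {
        private String frequentValue;   // hypothetical hot-path entry
        private String fallbackValue;   // hypothetical second special case
        private final Map<String, String> everythingElse = new HashMap<>();

        void store(String key, String value) {
            if ("frequent".equals(key)) {
                frequentValue = value;
            } else if ("fallback".equals(key)) {
                fallbackValue = value;
            } else {
                everythingElse.put(key, value);
            }
        }

        String find(String key) {
            if ("frequent".equals(key)) {
                return frequentValue;
            } else if ("fallback".equals(key)) {
                return fallbackValue;
            } else {
                return everythingElse.get(key);
            }
        }
    }

    public static void main(String[] args) {
        SimpleLookup lookup = new SimpleLookup();
        lookup.store("frequent", "cached result");
        System.out.println(lookup.find("frequent")); // prints "cached result"
    }
}

Both versions do the same job, but only one of them gives a bug somewhere to hide.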
Particularly for those of us working in corporate enterprises, the bottom line is, well…the bottom line. It’s ultimately all about cost/benefit ratios. The reason that we should be writing clear, maintainable code, writing unit tests, writing clear documentation, and otherwise listening to our Uncle Bob, is because in the long run, it costs less. Doing these things improves the cost/benefit ratio. Premature optimization, when it increases complexity, makes the cost/benefit ratio worse. Therefore, it truly is the root of all evil.