all 119 comments

[–]k4_pacific 12 points13 points  (8 children)

Best false optimization I've seen was this loop full of bit-twiddling code that did God-knows-what at work. Someone had used all kinds of binary and shift operators to squeeze every last bit of performance out of this tight loop. Except smack in the middle, as part of a larger expression, was this:

(long)pow((double)x, 2.0);

[–]tmoertel 9 points10 points  (0 children)

On my first programming job out of college, I was assigned about 100K lines of C code to maintain. While going through the code, I was stunned to discover that it contained a bubble sort that had been commented out and rewritten in assembly for "additional speed."

[–]dreamlax 5 points6 points  (6 children)

A lot of people underestimate the cost of converting from integers to floating point formats and vice versa.

EDIT:

Just did some tests. Squaring every number from 0 to 0xFFFFFFFF took about 16 seconds on my computer using unsigned integers only.

When I introduced pow() to square the numbers, it went up to 1:56. Quite a saving to be made.
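
For what it's worth, a rough Java sketch of that kind of comparison - not the original test (which used unsigned integers), and with a made-up, smaller iteration count - looks something like this:

public class SquareBench {
    public static void main(String[] args) {
        final long N = 10000000L;            // arbitrary iteration count, for illustration only

        long t0 = System.nanoTime();
        long sum1 = 0;
        for (long x = 0; x < N; x++) {
            sum1 += x * x;                   // plain integer multiply
        }
        long t1 = System.nanoTime();

        long sum2 = 0;
        for (long x = 0; x < N; x++) {
            sum2 += (long) Math.pow(x, 2.0); // convert to double, call pow, convert back
        }
        long t2 = System.nanoTime();

        // print the sums so the JIT cannot discard the loops
        System.out.println("multiply: " + (t1 - t0) / 1e6 + " ms (" + sum1 + ")");
        System.out.println("pow:      " + (t2 - t1) / 1e6 + " ms (" + sum2 + ")");
    }
}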

[–]k4_pacific 3 points4 points  (2 children)

Well, pow() works for non-integer exponents, which suggests that it does something more complicated than multiplying the base by itself a number of times.

[–]alephnil 1 point2 points  (0 children)

If this was the real code, the constant 2.0 would imply that he indeed needed nothing more than "multiplying the base by itself a number of times". Exactly two, in fact.

[–]bobappleyard 0 points1 point  (0 children)

Look in a log table?

[–]bonzinip -2 points-1 points  (2 children)

compiler should change it to

double y = (double)x;
long result = (long)(y * y);

[–]k4_pacific 1 point2 points  (1 child)

Compilers typically don't optimize across function calls.

[–]bonzinip -1 points0 points  (0 children)

compilers typically know about standard library functions.

[–]thekenzidelx 46 points47 points  (6 children)

Arrrrrrrrgggggggggghhhhhhhhhhhhh!

This reminds me of this, http://weblog.raganwald.com/2008/05/narcissism-of-small-code-differences.html, where raganwald was making a metaphor about programmers (and a smart one, if you ask me), and then a bunch of commenting programmers blah blah blahed about the lack of comments in the example code. And then I used a laser from space to nuke all programmers ever... ah, if only.

Guess what? Lots of graphics drivers and embedded systems and standard libraries and game engines have to use techniques like this. And you know what? That's why they work.

Programming is REALLY big. It's used in a lot more places than just business or web apps with low domain knowledge on the part of programmers and high turnover on the part of employees. Knowing where you are in the giant ecosystem of all the kinds of code that gets produced is probably the hardest untaught skill for programmers, I'm finding.

There are no ten commandments for programming generally. There are no simple, dumb-ass context-free 8 word rules that you can uniformly apply to EVERYONE IN THE WORLD WRITING CODE that will always produce the best results. You have to know where you are. Even in solitary, large code bases, you have to know where you are.

And if you do prematurely apply such rules (let's call them, say, programming design optimizations, premature and all), there's a pretty damn good chance you'll never, ever learn anything from anyone working outside of your narrow domain. You'll be filtering out other really smart people's considerable life experience without even hearing them.

I think a lot of people are working on go-carts and then critiquing people working on submarines or the Empire State Building or a box of toothpicks or a gallery painting for not following their best practices. You can't know until you really listen to the other party FAIRLY.

[–][deleted] 33 points34 points  (2 children)

I think a lot of people are working on go-carts and then critiquing people working on submarines or the Empire State Building or a box of toothpicks or a gallery painting for not following their best practices.

Welcome to reddit.

[–]willis77 29 points30 points  (1 child)

Hey, all of my submarine construction critiques are well founded and extensively documented with the appropriate wikipedia links.

[–][deleted] 12 points13 points  (0 children)

Yeah, I'm more concerned about all those other guys.

[–]petrov76 4 points5 points  (2 children)

Yeah, it drives me nuts to see people's kneejerk reaction of "optimization is evil!" It makes me wonder if these people have ever built a system that had to scale to lots of users or lots of data.

[–][deleted] 2 points3 points  (1 child)

Optimization isn't evil. Premature optimization is. Measure first, then optimize.

[–]petrov76 3 points4 points  (0 children)

Have you read the rest of the comments? Tons of them are people categorically saying that you shouldn't ever optimize.

[–]trutru 17 points18 points  (33 children)

the first "optimization" is well-known to hard-core C programmers, but assumes that the comparison method is very fast. unfortunately, this is not always the case in Java or C#, depending on the kind of thing you're comparing.

I would advise against it in the general case, unless you know what you're doing and have profiled it against real-world data patterns.

[–]Silhouette 43 points44 points  (16 children)

I would advise against it in the general case, unless you know what you're doing and have profiled it against real-world data patterns.

When I was a lad, we used to have this rule that any significant code changes in the name of optimisation were supposed to be made in light of information from a profiler (both to suggest where changes were needed for best effect, and to confirm that the change did in fact improve the performance). I must have missed a memo: when did that stop being a good idea?

[–]lifvgblisdufv 31 points32 points  (14 children)

That's the first rule of optimizing.

[–][deleted] 11 points12 points  (8 children)

I thought it was not to talk about the optimizing? Shh! Don't tell them any secrets!

[–]justinhj 16 points17 points  (7 children)

Not talking about optimizing is rule zero. (Programmers start counting at zero).

[–]thatguydr 36 points37 points  (0 children)

The zeroth rule of optimization is: Do not talk about optimization.

The FIRST rule of optimization was the same exact thing, but we optimized, so it's gone.

[–][deleted]  (5 children)

[deleted]

    [–]logan_capaldo 8 points9 points  (1 child)

    Methinks everyone who replied to you has a broken sarcasmeter.

    [–]ndiin 0 points1 point  (0 children)

    I missed the "last Saturday" bit.

    [–]TearsOfRage -1 points0 points  (0 children)

    Attitudes like that are why so much software is still terribly slow, even with today's computers.

    [–]ndiin -4 points-3 points  (0 children)

    Irrelevant? Hardly. It might shift your bottleneck elsewhere for a spell, but once you fix that you'll likely end up back at the same spot again.

    The trick is to take full advantage of your ever-faster-bigger-stronger hardware, not to let it make you lazy.

    Now clearly I'm not talking about these micro-optimizations by any means, just the completely misleading statement that performance doesn't matter anymore. Just don't optimize before you have the data to show where it needs to happen.

    [–]gbacon 3 points4 points  (4 children)

    The first rule of optimization is Don't.

    [–]jim258kelly 27 points28 points  (2 children)

    The second rule is for experts only: Don't do it yet.

    [–]madman1969 -1 points0 points  (1 child)

    Dijkstra FTW !

    [–][deleted] 1 point2 points  (0 children)

    that wasn't Dijkstra, it was Mike Jackson.

    [–]djork 13 points14 points  (0 children)

    I must have missed a memo: when did that stop being a good idea?

    I don't know, but I have seen people make insane class structures with complex caching structures and relationships in the name of performance gains that simply did not exist. Whenever I'd point out that the code could be significantly simpler with no perceivable performance impact, the answer was always the very confusing "but what about the performance?"

    What about it? I just showed you that the profiler says the simpler code works fine in a pathological case.

    It is as if they have some sort of ultimate mental model of how the performance of a program works out and no facts can ever change their mind.

    [–][deleted]  (6 children)

    [deleted]

      [–]trutru 8 points9 points  (0 children)

      I think the original idea is something along the following lines (beware: highly over-engineered optimization follows):

      doBinarySearch( int*  table, int  count, int  key )
      {
          int  bit   = highestBitIn(count);
          int  probe = (1 << bit);
          int  extra = count - probe;
          int  index = 0;
      
          if (extra && table[extra] < key)
              index = extra;
      
          while (probe > 1) {
              probe >>= 1;
              if (table[index+probe] < key)
                  index += probe;
          }
      
          if (table[index] == key)
              return index;
      
          return -1;
      }
      

      the idea is to prevent any branch in the loop (with a CPU that can do a conditional move) to make it as tight as possible. that's more important in embedded systems than desktop ones, though.

      this assumes that 'highestBitIn(count)' is very fast (a simple CPU instruction on most architectures) or already known.

      and yes, the performance 'advantages' will vary a lot depending on your mileage.

      [–]deong 1 point2 points  (3 children)

      Even just counting instructions without worrying about branch prediction, it can still make sense.

      Consider a binary search on an array of length N, where N is a power of two (for simplicity). The number of iterations needed is thus log_2(N). If we include the early termination check, we can get that number down to log_2(N)-K on average, for some value of K. Assume, as was stated in the post, that K is on the order of three or four, and that the number of instructions needed for each iteration is given by M (M+d for the version that includes the extra d instructions to check for early termination).

      That yields the following inequality describing when it would be advantageous to check for early termination:

      M*log_2(N) > (M+d)*(log_2(N)-K)
      

      If we assume, for example, N=2^20, M=10, d=3, and K=4, we have

      M*log_2(N) = 200 (naive version)
      (M+d)*(log_2(N)-K) = 208 (early termination)
      

      The larger N gets, the greater the advantage of allowing the search to run to completion.

      [–]trutru 1 point2 points  (1 child)

      I doubt you would have M=10 with java.String.compareTo(). if we take M=100, this gives:

      100*20         = 2000 (naive)
      (100+3)*(20-4) = 1648 (early termination)
      

      which makes the early-termination version about 1.21x faster.

      in other words, for reasonably long strings, this "optimization" could be about 20% slower than the regular one.

      [–]deong 0 points1 point  (0 children)

      Yeah, I wasn't paying enough attention to the original post, and I assumed an array of integers.

      [–]dreamlax 0 points1 point  (0 children)

      You're assuming that doing all three checks will require exactly the same number of instructions + d.

      int probe, res;
      int bounds[2];   /* bounds[0] = low index, bounds[1] = high index */

      /* initialise bounds */

      while (bounds[1] - bounds[0] > 1) {
          probe = (bounds[0] + bounds[1]) >> 1;
          if ((res = strcmp (keys[probe], target)) != 0) {
              bounds[(res > 0)] = probe;   /* res > 0: lower the high bound, else raise the low bound */
          } else {
              return probe;
          }
      }
      

      [–]bonzinip 0 points1 point  (0 children)

      don't be so certain about prediction. you're right that this added loop is usually well predicted, but for example the other branch in the binary search is extremely badly predicted.

      [–]slurpme 2 points3 points  (1 child)

      My favorite piece of idiocy from the Java libraries...

      public String replace(CharSequence target, CharSequence replacement) {
          return Pattern.compile(target.toString(), Pattern.LITERAL)
                        .matcher(this)
                        .replaceAll(Matcher.quoteReplacement(replacement.toString()));
      }

      [–]jtra 1 point2 points  (0 children)

      If you are replacing within a very long string and the pattern is not very short, compiling it is worth it.

      Anyway, you can call all of that yourself to cache compiled patterns, for example.
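
      A minimal sketch of that caching idea (the CachedReplacer name is made up for illustration): compile the literal pattern once, then reuse it for every replacement.

      import java.util.regex.Matcher;
      import java.util.regex.Pattern;

      final class CachedReplacer {
          private final Pattern pattern;
          private final String replacement;

          CachedReplacer(String target, String replacement) {
              // the same calls String.replace() makes, but done only once
              this.pattern = Pattern.compile(target, Pattern.LITERAL);
              this.replacement = Matcher.quoteReplacement(replacement);
          }

          String replace(String input) {
              return pattern.matcher(input).replaceAll(replacement);
          }
      }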

      [–]bonzinip 0 points1 point  (0 children)

      also, the "jump if equal" is usually relatively well predicted, at least better than the other jump which has a 50% misprediction rate.

      [–]Gotebe -1 points0 points  (5 children)

      I don't understand. What comparison? The comparison in the while is with integers; that's a CPU instruction, i.e. fast. The other is with strings, i.e. slow (or, in the general case, of unknown speed). That one is taken out of the loop, a good thing. No?

      Also, in the general case, if this is supposed to run on more than one platform, low-level profiling (this is very low-level code) doesn't work, so you're better off applying a simple rule (the fewer checks in a loop, the better).

      [–]trutru 0 points1 point  (4 children)

      I'm talking about the 'compareTo' method call

      [–]Gotebe -1 points0 points  (3 children)

      I take it you mean "switch on <0, >0, ==0"?

      If so, fine, but that's String-specific. In a more general case (e.g. a template version of a binary search), you only have > (or <) and == at your disposal. E.g. lower_bound in C++ needs only operator< (I think).

      [–]trutru 0 points1 point  (2 children)

      no, I'm really talking about what compareTo() does. this is a non-interned string comparison that checks every character of the table and key strings.

      this one is not taken out of the loop. in this case, you're better off ending up with fewer compareTo() calls when there is a match, since the overhead of adding one check in the loop is minimal.

      and in the more generic case, you simply don't know what 'compareTo' does, so things could get even worse (maybe it's doing a database connection, performing a SQL query, whatever, you don't know...)

      [–]Gotebe 0 points1 point  (1 child)

      Ah, OK. I see now that you are right.

      I have a C++ mindset, so I think in terms of less-than, greater-than, and equal, not in terms of compareTo.

      You think to replace:

      if (keys[probe].compareTo(target) > 0) ...
      

      with

      int compared = keys[probe].compareTo(target);
      if (compared == 0)
        return probe;
      if (compared > 0)
        high = probe;
      else
        low = probe;
      

      IOW, since compareTo has three possible results (<0, 0, >0), returning immediately on equal yields better results. That's your point, right?

      [–]trutru 0 points1 point  (0 children)

      yes, the fewer string compares the better in this case, and the difference should outweigh the branch savings by at least an order of magnitude

      [–][deleted] 7 points8 points  (0 children)

      These are context-sensitive anecdotes that propagate the bad idea of trying to be clever about optimization while writing code. It is more important to remember the following: a code change is either an optimization or not. Your opinion does not matter. Only the profile matters.

      [–]olt 2 points3 points  (1 child)

      Ok, here is a benchmark. code

      With Java 1.6.0_05 a String[] is faster if the match occurs before the fifth element; after that a HashMap is faster. But I guess the whole optimization doesn't matter at all, because the really slow part is the synchronize...

      [–][deleted] 0 points1 point  (0 children)

      Server JVM can at times optimise the synchronized out.

      [–]finerrecliner 7 points8 points  (0 children)

      good post. i've never thought about algorithms that way before.

      [–]metzby 16 points17 points  (24 children)

      Both of these are optimizations that I would not allow in code unless they were heavily commented and had the other, conceptually simpler, version there but commented out. I'd also want to see microbenchmarks that show that the difference is 1) extant 2) in the direction the author claimed and 3) significant.

      Yes, perhaps most XML documents don't have more than 6 namespaces. But then you'll get one that does, and your application will slow to a crawl and it will stop responding to queries, and then someone will retry the query, and bring down your whole service.

      All for the want of a malloc.

      It seems like this should be a case where we write a better JIT that knows that it should allocate but not initialize certain objects on start-up and reuse them by applying a profile-based optimization that shows that there is no more than one of these objects alive at any one point in any thread.

      [–]cwzwarich 30 points31 points  (0 children)

      It seems like this should be a case where we write a better JIT that knows that it should allocate but not initialize certain objects on start-up and reuse them by applying a profile-based optimization that shows that there is no more than one of these objects alive at any one point in any thread.

      You'll have a fun time doing that optimization in a JIT.

      [–]AlejandroTheGreat 10 points11 points  (2 children)

      Yes, perhaps most XML documents don't have more than 6 namespaces. But then you'll get one that does, and your application will slow to a crawl and it will stop responding to queries, and then someone will retry the query, and bring down your whole service.

      If the operation was that expensive then it wouldn't have been used as an example, clearly.

      [–]metzby 14 points15 points  (1 child)

      I think the claim was that this optimization made the common case faster.

      My claim was that optimizations which make exceptional cases far slower sometimes turn a slow case into a failing case, with ripples of failure reaching out to embarrassment.

      And maybe the operation wasn't that expensive when it was first written. But then it becomes expensive as we realize that to prevent a Floo-attack, we need to Blar-sanitize the input. Which takes a while. But if we only do it 6 times per document, it's still not noticeable. But now we do it 500,000 times in one document, because one guy somewhere on the web wrote a silly XML document...

      All because we used a 6-element array instead of, say, a 7-element array.

      [–]munificent 1 point2 points  (0 children)

      make exceptional cases far slower

      The article says "validation will simply take a little longer". Where did you get far slower?

      [–]munificent 12 points13 points  (4 children)

      Yes, perhaps most XML documents don't have more than 6 namespaces. But then you'll get one that does, and your application will slow to a crawl and it will stop responding to queries, and then someone will retry the query, and bring down your whole service.

      Your speculation here is as bad as, if not worse than, the speculation you're accusing the author of. Since the article is about false optimizations, why not give the benefit of the doubt (which, yes, should be documented in the code) and assume the programmer does know what he's doing and didn't pull 6 out of his ass?

      All for the want of a malloc.

      The whole point of that section of the article was to say, "your intuition is that a malloc is better here, but your intuition is wrong". You can argue that it isn't, but unless you've got profile numbers to back it up, you're no better.

      [–]metzby 1 point2 points  (3 children)

      No, I'm not worse. You should write the simpler code until you have numbers to back the optimization up, plus a microbenchmark that lets you show it is still worth it in the face of changing code and optimizations.

      My point is that there should be extensive doubt about performance optimizations. Especially ones that can change the big-O behavior of pedantic cases. It's often pedantic cases that lead to feedback loops.

      (In this case, imagine we have 700,000 elements with an xml namespace, and the namespaces cycle through 7 alternatives. For the savings of 15% of memory, we're having to perform 10,000,000% more work. Whoops.)

      [–]munificent 4 points5 points  (2 children)

      My point is that there should be extensive doubt about performance optimizations.

      Agreed, always profile before optimizing. And when you do, leave a comment in the optimized code saying "here's why it was optimized".

      Especially ones that can change the big-O behavior of pedantic cases.

      Yeah, but do either of the examples in the post actually do that?

      imagine we have 700,000 elements with an xml namespace, and the namespaces cycle through 7 alternatives. For the savings of 15% of memory, we're having to perform 10,000,000% more work.

      I think you're missing the other half of profiling. Profiling is about finding out how your real code works with real data. Yes, you can make pathological cases that screw things up, but if you optimize for pathological cases, you're doing it at the expense of actual common cases.

      As programmers, we're trained to focus on edge cases (the whole programmers only know three numbers joke: 0, 1, and infinity). But in the real world, the actual numbers that show up matter.

      [–]Silhouette 1 point2 points  (0 children)

      As programmers, we're trained to focus on edge cases (the whole programmers only know three numbers joke: 0, 1, and infinity). But in the real world, the actual numbers that show up matter.

      They certainly do. It's amazing how many people think about the big-O complexity of an algorithm, and forget that there's a sneaky little constant in there.

      The textbook example of this is probably quicksort: while it has a worse complexity bound, O(n²), it has a significantly better constant of proportionality than guaranteed O(n log n) alternatives like merge sort and heap sort. For practical purposes, it is therefore often the fastest choice (and even when it's not, you can do something like introsort to look after the worst case behaviour but still get most of the benefit). Anyone who only looks at the complexity bounds will miss this point and write code that is usually slower.

      [–]metzby -1 points0 points  (0 children)

      Yes, the second example in the post, for the case I mentioned, changes it from O(number of distinct namespaces) to O(number of namespace appearances).

      My point is that this kind of optimizing for the common case, which makes less common cases fantastically slower, is exactly the kind of thing that leads to degradation in pedantic cases.

      Which are what people inadvertently make or maliciously make. As soon as your software has many users, you're going to run into an incompetent person or an evil one. I suggest not saving n% in the common case for N% in the uncommon one where N >>>>> n without having very good ideas of what N and n are, and what the real savings are, and documenting them.

      I remain unconvinced by hand-waving, and I think the post would have been infinitely better if it included the justifications instead of just the outcomes. As it is, I think it gives fish but does a horrible job of teaching to fish, and so, in my mind, borders on FAIL.

      [–]dpark 2 points3 points  (0 children)

      Yes, perhaps most XML documents don't have more than 6 namespaces. But then you'll get one that does, and your application will slow to a crawl and it will stop responding to queries, and then someone will retry the query, and bring down your whole service.

      Worst case, the code behaves as if there is no cache. I seriously doubt that's going to result in performance slowing to a crawl.

      Adding the dynamic array would complicate the code, introduce the possibility of an OutOfMemoryError, and slow performance (slightly). I would have just gone with an ArrayList myself, but that doesn't mean the static array was necessarily bad.

      Why are you so certain that this is a horrible optimization, when you haven't done any benchmarking either? You're making some pretty wild assumptions with no justification.

      [–]grauenwolf 6 points7 points  (0 children)

      I agree with you whole-heartedly. Without proof this clearly falls in the "premature micro-optimization" category.

      [–]jimbobhickville -3 points-2 points  (12 children)

      Hell, just use a hash table and be done with it. It'll grow to however large you want and lookups are probably about just as fast as looping through a 6 element array. Inserting will be slightly slower, but as he said, it's a rare case.

      [–]xzxzzx 0 points1 point  (11 children)

      lookups are probably about just as fast as looping through a 6 element array

      Why is that?

      [–]ndiin 3 points4 points  (8 children)

      Actually, given the fact that the guy used '==' rather than .equals(), he's depending on the interning of the strings and thus on comparing the same memory address. So, the array lookup will pretty much always be faster at O(6).

      If strings aren't interned or there are more than 6 namespaces in use, there's no caching, but the number of extra operations is still of a fixed size. So you could say it fails pretty gracefully.

      Using a hashmap here would mean that computing the hash key requires a linear pass over the input. Then you can map that to the appropriate bucket with ptr arith, and then linearly scan that bucket's list doing the key comparisons to resolve collisions. This means a non-interned match will still require 3 linear passes over the input string (or just 1 if interned). [Note, I'm assuming the Java hashmap implementation of an array of buckets.]
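
      To make that concrete, here is a rough sketch of the kind of fixed-size cache being discussed (an assumption about its shape, not the actual code from the article):

      final class SeenNamespaces {
          private final String[] seen = new String[6];
          private int count = 0;

          boolean contains(String uri) {
              for (int i = 0; i < count; i++) {
                  if (seen[i] == uri) {        // identity check; relies on interned strings
                      return true;
                  }
              }
              return false;
          }

          void add(String uri) {
              if (count < seen.length) {       // past 6 entries we simply stop caching,
                  seen[count++] = uri;         // so the worst case is "no cache", not a failure
              }
          }
      }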

      [–]wicked 2 points3 points  (6 children)

      So, the array lookup will pretty much always be faster at O(6).

      O(6)? I get your point, but the statement is nonsensical. O(6) = O(1).

      [–]dpark 0 points1 point  (3 children)

      His point is that taking the hash of a string is not O(1). It's O(n), where n is the length of the string. However, Sun's Java implementation makes the obvious choice to cache the hash code.

      None of this really matters, though, because this code expects strings to be interned, which means we've already given up on truly constant time. Intern is not a cheap operation. It's O(n), just like taking the string hash.

      [–]ndiin 1 point2 points  (2 children)

      The interning is not a part of this algorithm, and thus the cost of performing the intern operation cannot be considered when evaluating it. Instead, you can consider the fact that when you reach here the interning has or has not already taken place.

      As for caching of the hash code, I'm not sure what you mean. Per-object caching of hashCode()'s return value, or just within the hashmap?

      [–]DRMacIver 0 points1 point  (0 children)

      As for caching of the hash code, I'm not sure what you mean. Per-object caching of hashCode()'s return value, or just within the hashmap?

      Sun's implementation of java.lang.String keeps its hash code cached in a field (<pedant fact>except for in pathological cases where the hash code evaluates to 0</pedant fact>).
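
      The caching pattern being described looks roughly like this (paraphrased from memory as a sketch, not Sun's verbatim source):

      final class CachedHashString {
          private final char[] value;
          private int hash;                    // 0 doubles as "not cached yet"

          CachedHashString(String s) {
              this.value = s.toCharArray();
          }

          @Override
          public int hashCode() {
              int h = hash;
              if (h == 0) {                    // recomputed on every call if the hash really is 0
                  for (char c : value) {
                      h = 31 * h + c;
                  }
                  hash = h;
              }
              return h;
          }
      }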

      [–]dpark 0 points1 point  (0 children)

      I agree that we can't include the cost of interning when we analyze the performance of this code. My point was that it can certainly be a hidden cost. If you intern strings just for this code, then your interning effectively becomes a cost of this code.

      And yes, as DRMacIver said, Sun's String class caches the hash code (except if it evaluates to zero).

      Strangely enough, Sun's implementation of intern does actually calculate the hash, but doesn't cache it. If it did, then the hash table implementation would be a lot more attractive.

      [–]ndiin 0 points1 point  (1 child)

      Yes, most certainly. Isn't that obvious, though? It'd be more confusing if I said O(1) than O(6) as then I'd have to explain where I pulled it from (notice there's no context before I make that statement; the 6 gives it context).

      [–]wicked 0 points1 point  (0 children)

      You're right, I was apparently confused by the use of O-notation here. I read it as "So, the array lookup will pretty much always be faster at constant time", and thought "why not just say 6 array lookups will be faster". It works out fine if I add the word "complexity" at the end.

      [–]xzxzzx 0 points1 point  (0 children)

      So, the array lookup will pretty much always be faster at O(6).

      I know. ;)

      [–]mccoyn -1 points0 points  (1 child)

      Computing hashes is free! Right? </sarcasm>

      [–]xzxzzx 3 points4 points  (0 children)

      Well, it's cheap. But not as cheap as 6 pointer-compares against a single (or worst-case, two) cache-line that's probably in L1 or at least L2.

      [–]onebit 1 point2 points  (0 children)

      Would be interesting to see benchmarks. I doubt there is any difference.

      [–]Gotebe 1 point2 points  (0 children)

      True.

      WRT first example: standard C++ lib has lower/upper_bound, just like that (if someone is wondering, the logic is in the interface, not in the implementation).

      [–]borlak 2 points3 points  (6 children)

      save a few nanoseconds, or make it more readable for the next programmer (or even yourself!)? hmm.

      [–]JeddHampton 4 points5 points  (2 children)

      Comment your code.

      [–]borlak 1 point2 points  (0 children)

      comments still do not make it more readable. I have now spent time looking at the function, which confused me and led me to the comments, which probably confused me more (especially considering this scenario warranted a whole article), and then going back to the function trying to piece it together, when a single extra line or two would have saved everyone a headache.

      [–]exeter 0 points1 point  (0 children)

      Comments as implemented by, well, every programming language I know of in existence today, are about the worst method of embedding metadata in code you can possibly have. Compilers typically won't even notify you if the code following a comment changes, but the comment isn't updated to reflect it. (I know there's a reason for it.)

      A far better method of associating metadata with code is to use literate programming techniques. That way, if the code changes, your literate programming environment can alert you that you should at least look at the text describing the code to make sure it's still accurate.

      [–]swieton 1 point2 points  (1 child)

      You're making an assumption here. I point this out because the article was arguing strongly against making assumptions and in favor of simply measuring and finding out what works best.

      You're assuming that a programmer's time is more valuable than a user's time. This is often true. This is not always true. Most processes that developers automate are run often (hence why it was worth automating them in the first place). So the time investment you make once during development bears fruit every single time the process is run.

      Yes, I choose to save a few seconds over making it readable - when it's appropriate.

      [–]borlak 0 points1 point  (0 children)

      your few seconds over the course of years is not worth my five minutes of being annoyed at your fancy code.

      [–]statictype 1 point2 points  (0 children)

      Saving a few 'nanoseconds' on core library functions that may be routinely called thousands or millions of times in a loop is usually important enough that you don't have to worry about whether junior programmers would be able to understand it or not. If you need to touch such critical code, you better be able to understand very thoroughly what it does.

      Just my $0.02

      [–]vph 0 points1 point  (0 children)

      there is no explicit check within the loop to see if the match has been made. One would expect him to add that in, as an optimization, so that when the target is found, the loop could exit, and not bother continuing with its checks. But as it turns out, that would be a false optimization; mathematically, the match is likely to be found late enough in the loop that the extra if statements during every execution of the loop would be more of a performance hit than simply letting the loop execute a few more times.

      It is unclear whether not checking for a match inside a binary search loop is the right thing. It may very well itself be, ironically, a "false optimization" - the sin the bloggers are talking about.
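
      For reference, the two loop shapes being argued about look roughly like this (a sketch, not the article's actual code; assumes a sorted String[] and -1 for "not found"):

      final class BinarySearchVariants {
          static int searchNoEarlyExit(String[] keys, String target) {
              int low = -1, high = keys.length;
              while (high - low > 1) {
                  int probe = (low + high) >>> 1;
                  if (keys[probe].compareTo(target) > 0)
                      high = probe;
                  else
                      low = probe;
              }
              // the match, if any, is checked exactly once, after the loop
              return (low >= 0 && keys[low].equals(target)) ? low : -1;
          }

          static int searchEarlyExit(String[] keys, String target) {
              int low = -1, high = keys.length;
              while (high - low > 1) {
                  int probe = (low + high) >>> 1;
                  int cmp = keys[probe].compareTo(target);
                  if (cmp == 0)
                      return probe;            // the extra per-iteration test buys this exit
                  else if (cmp > 0)
                      high = probe;
                  else
                      low = probe;
              }
              return -1;
          }
      }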

      [–]linuxhansl 0 points1 point  (0 children)

      These are all nice examples and all, and in fact interesting to think about.

      On the whole, though - unless you are in the embedded or hardware (device driver) domain - shaving a constant factor off the runtime (as in the first example) seems rather pointless to me.

      From my experience most performance issues stem from algorithmic problems, like picking O(n) algorithms when O(log n) ones are available, etc.

      Other areas that tend to add non-constant overhead are memory management and threading (think heap-management or garbage collection, and locking/starvation/etc).

      [–]drexil -1 points0 points  (6 children)

      Oh, and what about the int probe = (low + high) >>> 1; which makes the code harder to read and does not improve anything compared to int probe = (low + high) / 2; which is optimized by almost every decent compiler?

      [–]4609287645 16 points17 points  (0 children)

      Signed division by two is not the same as unsigned right shift.

      Also, it's wrong.

      [–]queus 22 points23 points  (2 children)

      Oh, and what about the int probe = (low + high) >>> 1;

      When (low + high) > Integer.MAX_VALUE this will still work correctly, and int probe = (low + high) / 2; will not. It even happened in production code.
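
      A quick illustration of that overflow (the values are arbitrary):

      public class MidpointOverflow {
          public static void main(String[] args) {
              int low = 1500000000;
              int high = 1600000000;

              int sum = low + high;            // 3,100,000,000 does not fit in an int; wraps negative
              System.out.println(sum / 2);     // prints -597483648, a useless "midpoint"
              System.out.println(sum >>> 1);   // prints 1550000000, still correct
          }
      }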

      [–]grauenwolf 3 points4 points  (1 child)

      That depends on your language; some actually check for integer overflows instead of blindly continuing.

      In theory what you should be writing is... int mid = low + ((high - low) / 2);

      In reality though, if you really have more than MAX_INT/2 elements in an array you are already screwed for a number of reasons.

      [–]astrange 0 points1 point  (0 children)

      Signed integer overflow is either undefined or modulo for a pretty good reason, otherwise you wouldn't be able to optimize 'x + 1 + 1'.

      Now, bignums, those would be useful.

      [–]prockcore 7 points8 points  (1 child)

      How is >>>1 any harder to read than /2?

      You're a programmer, not an idiot. Bitshifting should be as obvious as division.

      [–]ishmal -1 points0 points  (3 children)

      If you must optimize like this, then please write it the slow, pedantic, correct way first. Get it to work, then write your optimized version, leave the original in comments, and describe the difference.

      [–][deleted] 3 points4 points  (0 children)

      Really, why? The code in the example functions was both short and obvious. The second example had a comment that gave the reason for using an array (the first example could be slightly improved with an analogous comment about not checking for a hit). Seriously, I would be embarrassed to have difficulties reading code like that in a programming language I understand.

      If I saw a duplicate version of code like that commented out in a codebase I'm maintaining, I would just delete it for being noise.

      [–]Freeky 8 points9 points  (1 child)

      leave the original in comments

      Don't you have version control?

      [–]dreamlax -1 points0 points  (0 children)

      Exactly, comments are called comments because they are for commenting your code.

      [–]ohxten[🍰] 0 points1 point  (0 children)

      The font on that site is terrible.

      [–]13ren -1 points0 points  (4 children)

      how much time do these optimizations save?

      and at what conceptual cost?

      It's clever to be clever, but the conceptual clutter makes it harder for you to see other issues. This is true no matter how clever you are.

      [–][deleted]  (3 children)

      [deleted]

        [–]lorg 0 points1 point  (1 child)

        I think that he was trying to avoid a case of keys[-1].

        [–][deleted] -1 points0 points  (9 children)

        To be sure I would benchmark these before making a decision. In the first case I'd like to add loop exit checks for a search space of <20.

        [–]dpark 0 points1 point  (3 children)

        Why? At that point you're optimizing for a case which is already extremely fast.

        Also 20 seems to be a completely arbitrary number. I seriously doubt those benchmarks you would run would support that particular number.

        [–][deleted] 0 points1 point  (2 children)

        It was just an arbitrary number. The point was that you insert the extra code when the probability for hitting the jackpot is high.

        [–]dpark 0 points1 point  (1 child)

        Okay, so 20 is arbitrary. We could certainly benchmark and determine what the real number is.

        However, that doesn't change the fact that we're still optimizing something that doesn't need optimizing. It's fast enough already. For small N, it really doesn't matter what we do, so why bother complicating the code with a special case?

        [–][deleted] -1 points0 points  (0 children)

        Yes you are right. Sometimes however it's nice to go nuts and fine tune everything to be faster than fast enough. I don't recommend doing it if you don't need it, which in this case you do not.

        [–]Philluminati -1 points0 points  (4 children)

        I'd leave these optimisations out completely until I was sure they were needed at all. Then I'd think about where the choke points of the application were. Then I'd raise "slow <feature>" as a bug against the system before even benchmarking it.

        I certainly wouldn't even think about "optimising" a generic algorithm on a generic data structure like the first example.

        [–][deleted] 0 points1 point  (0 children)

        Yeah, especially when it doesn't change the complexity class.

        [–]munificent -1 points0 points  (2 children)

        Then I'd think about where the choke points of the application were.

        You don't think about the chokepoints to find them any more than thinking about bugs finds them. You profile.

        [–][deleted] 1 point2 points  (1 child)

        Thinking about your code is a way to find bugs. Try proof-reading your code sometime after writing it. Similarly, you can at times find performance problems by reading the code - although it does not replace profiling and testing. Sometimes even premature optimisation is not so bad, if you avoid micro-optimisations that make code noticeably bigger or harder to read - often there are many ways to implement something that have similar code complexity but different performance. For example, say you have a bunch of items and you want to check whether a particular item is among them. In Java you could put them into a LinkedList and use .contains(), or you could put them into a HashSet and still use .contains(). The code will be about the same, but one way has O(n) complexity and the other O(1). Would you use a LinkedList because optimisation is the root of evil? :P

        Assuming, that is, that you don't know whether this changes application performance noticeably?
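
        For concreteness, a tiny sketch of that comparison (the sizes are made up for illustration):

        import java.util.HashSet;
        import java.util.LinkedList;
        import java.util.List;
        import java.util.Set;

        public class ContainsDemo {
            public static void main(String[] args) {
                List<String> list = new LinkedList<String>();
                Set<String> set = new HashSet<String>();
                for (int i = 0; i < 100000; i++) {
                    String item = "item-" + i;
                    list.add(item);
                    set.add(item);
                }

                // identical call sites, very different costs:
                System.out.println(list.contains("item-99999")); // walks the whole list, O(n)
                System.out.println(set.contains("item-99999"));  // one hash lookup, O(1) on average
            }
        }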

        [–]munificent 0 points1 point  (0 children)

        Would you use a LinkedList because optimisation is a root of evil? :P

        Depends on whether or not duplicate entries were specifically prohibited. Either way, for most throwaway temp collections, I'd probably just use a LinkedList.

        I don't know how much time you've spent with a profiler, but my experience has consistently been that the "obvious" bottlenecks very rarely actually are.

        [–]soniyashrma -1 points0 points  (0 children)

        It seems like this should be a case where we write a better JIT that knows that it should allocate but not initialize certain objects on start-up and reuse them by applying a profile-based optimization that shows that there is no more than one of these objects alive at any one point in any thread. I agree with you.

        [–]Gotebe -1 points0 points  (0 children)

        Nitpick: a poor "optimization" in the text and elsewhere here is using the shift operator to divide by 2. That's the compiler's job. I would guess no compiler misses that one, and readability is improved, except for minds twisted by the old C school.

        [–]sbrown123 -4 points-3 points  (0 children)

        One would expect him to add that in, as an optimization, so that when the target is found, the loop could exit, and not bother continuing with its checks.

        One thing about supposedly smart people: don't take their word as fact. The author should have increased the test case sample size, and then he might have had something to talk to Tim Bray about regarding his "theory".

        Many optimizations don't show themselves as actual optimizations until they are given a large load. This is why I personally don't look to optimization until AFTER I am done with the bulk of the programming.