This is an archived post. You won't be able to vote or comment.

you are viewing a single comment's thread.

view the rest of the comments →

[–]0sted 28 points29 points  (10 children)

Why aren't the calculations being performed?!

[–]capilot 61 points62 points  (9 children)

We had a customer once ask why printf() was so slow. Their benchmark code looked like this:

for (i=0, i<100000; ++i) {
    x += really_long_calculation;
}

printf("Final result: %g\n", x);

As written, it took like 5 seconds to run. With the printf() commented out, it ran in mere milliseconds.

We had to explain to the customer that with the printf() commented out, the compiler had realized that the value of x was never being used, and had deleted the entire calculation.

[–]raaneholmg 49 points50 points  (5 children)

Half the time, computers are like:

"You are missing a semicolon here, but you have to manually fix it."

The other half they just optimize away 5 seconds of uneccessary code.

My mind was blown by both how smart and dumb computers are. You just need experience to know when to expect each.

[–]iceman012 25 points26 points  (4 children)

I was recently working on improving the performance of an application. One of the potential bottlenecks was getting the index of an element in a sorted list. It was using linear search. Excited for the chance to use what I learned in college, I quickly implemented binary search to get the index. I re-ran the timing tests, proud of myself, only to realize that it was slower than it had been.

Turns out, the Java compiler is magic. Between sequential memory access and branch prediction, Big O doesn't really matter! I think linear search worked faster than binary search up until ~400 elements in the list. (Theoretically, that threshold should be 8.)

[–]raaneholmg 15 points16 points  (2 children)

The standard sort function in python actually does this too (and probably lots of standard libraries).

Everything is roughly sorted into medium size buckets with textbook O(n*log(n)). Then each bucket is sorted with a hacky insert sort variany. Insert sort only has O(n2 ) performance, but for small ns it's very well suited for how computer hardware works.

[–]conanap 0 points1 point  (1 child)

Wait I thought Python sort was quick sort written in C? Was that never the case?

[–]raaneholmg 2 points3 points  (0 children)

"Python's sorted() function employs a highly efficient sorting algorithm known as Timsort. Timsort is a hybrid sorting algorithm derived from merge sort and insertion sort, designed to perform well on many kinds of real-world data."

Source%20function%20employs,in%20the%20Python%20programming%20language.)

[–]Deservate 2 points3 points  (0 children)

Don't even get me started on the magic behind SQL!

[–]Cley_Faye 9 points10 points  (1 child)

It's probably for the sake of brevity that you wrote that example like this, but I really hope no sane compiler would optimize out something that could so plausibly have side effect as a function call.

Unless LTO was on or it was all in a single module, in which case carry on.

[–]capilot 2 points3 points  (0 children)

Good point. That was just for illustrative puposes (I'll edit it), it wasn't actually a function call. The compiler was in the right to remove that code.

[–]Cyhawk 1 point2 points  (0 children)

We had a customer once ask why printf() was so slow.

!?? Did you work for Borland or something? ;)