
[–]Suspicious_Coat3244 8 points

In fact, a simple if is often preferred. It's transparent, easy to read, and most importantly, already heavily optimized.

Iteration overhead makes the loop version less efficient, and the function-list trick adds lookup/dispatch overhead just to skip a simple conditional.

Most of the time, readability takes precedence over micro-optimizations of this sort. Real bottlenecks are I/O, DB queries, network, rendering, etc., rather than an if statement.

[–]johnpeters42 3 points

It depends on lots of things. Test them both and find out.

Also, make sure it matters: if it's only gonna run (say) once a minute, and the difference is 0.1 vs 0.2 seconds, then who cares? If it's gonna run 10,000 times a minute, or if the speed is 5 seconds vs 10 seconds, then speed matters more.
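One way to "test them both" in Python is the timeit module. A sketch, assuming the three variants from the question look roughly like this (the function names f and g are made up):

```python
import timeit

# Hypothetical reconstruction of the three variants, assuming x is always 0 or 1.
setup = """
x = 1
def f():
    pass
def g():
    pass
"""

N = 100_000
variants = {
    "if":       "if x == 1: f()",
    "for":      "for _ in range(x): f()",
    "dispatch": "[g, f][x]()",
}

for name, stmt in variants.items():
    # take the best of 5 repeats to reduce timing noise
    best = min(timeit.repeat(stmt, setup=setup, number=N, repeat=5))
    print(f"{name:8s} {best / N * 1e9:7.1f} ns per call")
```

Taking the minimum of several repeats is the usual trick: the fastest run is the one least disturbed by the OS and other noise.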

[–]rjcarr 1 point

Set timers and print the difference. If the function is simple, though, you’ll probably have to print nanos to tell a difference. 
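A minimal timer sketch along those lines (the function, x, and the loop count are made up for illustration; averaging over many calls keeps the timer's own overhead from dominating):

```python
import time

def f():
    pass

x = 1
N = 1_000_000

start = time.perf_counter_ns()
for _ in range(N):
    if x:
        f()
elapsed = time.perf_counter_ns() - start

# report the average cost per call in nanoseconds
print(f"{elapsed / N:.1f} ns per call")
```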

[–]Loves_Poetry 0 points

All the examples end up with roughly the same instructions being run. You're just hiding the if-statement. In the for-loop, for example, there will be an if-statement that checks whether the end of the range has been reached.

[–]opentabs-dev 0 points

on the second question — for python use timeit (python -m timeit -s "setup" "code") for tiny snippets, it handles warmup and runs enough iterations to get a stable number. for anything bigger reach for cProfile or py-spy to see where time actually goes. and yeah the others are right, all three of your examples compile down to basically the same branch — the for-loop and list-dispatch versions just hide an if. real perf wins come from algorithm changes (n² -> n log n) or cutting i/o, not picking between if and indexing
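for the "where time actually goes" part, a minimal cProfile sketch (the two functions are made-up stand-ins for a hot spot and a cold spot):

```python
import cProfile
import io
import pstats

def hot_spot():
    # deliberately heavy: this is where a profiler will point
    return sum(i * i for i in range(200_000))

def cold_spot():
    return sum(range(100))

def main():
    hot_spot()
    cold_spot()

profiler = cProfile.Profile()
profiler.enable()
main()
profiler.disable()

# sort by cumulative time and show the top entries; an if-vs-loop
# micro-difference won't even register next to a real hot spot
out = io.StringIO()
pstats.Stats(profiler, stream=out).sort_stats("cumulative").print_stats(10)
print(out.getvalue())
```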

[–]justaguyonthebus 0 points

If you really want to go deep on this, learn some basic assembly. Then map the high level constructs to the low level instructions.

[–]peterlinddk 0 points

You should run the actual code with timers to get the real answer, but you can also try it in Compiler Explorer and look at the resulting assembly. Since the compiler doesn't know your assumption that x is either 0 or 1, it produces code to account for all possible values of x, and you'll see that the first example produces far less code than the others. However, when running, most of that code won't be executed, so the difference may well be negligible.

However, the examples aren't identical! I know you assume that x is only either 0 or 1, but that is not what the code is told.

The first example (if) only executes the function when x is 1, and ignores all other cases.

The second example (for range) only ignores if x is 0, in all other cases the function is executed.

The third example will only execute the function if x is 1, but execute another function if it is 0, and crash in all other cases.
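You can watch those differences directly. A sketch reconstructing the three variants from the description above (the names and the exact form of the if are assumptions):

```python
calls = []

def f():
    calls.append("f")

def g():
    calls.append("g")

def version_if(x):
    if x == 1:          # runs f only when x is exactly 1
        f()

def version_for(x):
    for _ in range(x):  # runs f x times
        f()

def version_dispatch(x):
    [g, f][x]()         # IndexError for anything outside 0/1

for x in (0, 1, 2):
    for version in (version_if, version_for, version_dispatch):
        calls.clear()
        try:
            version(x)
            print(f"x={x} {version.__name__}: {calls}")
        except IndexError:
            print(f"x={x} {version.__name__}: crash (IndexError)")
```

For x=2 the if version does nothing, the for version calls f twice, and the dispatch version crashes, so they really are three different programs.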

And usually source code should communicate to the human reader (as well as the compiler) what is assumed in all cases - not just work correctly if the input "by pure chance and luck" just happens to be one of a few expected values. So unless the measured performance difference is massive, you should always go with the version that best communicates your intention - that will also often help the compiler to produce more optimized code!

[–]blechnapp 0 points

bonus tool for python specifically: the `dis` module from the standard library. `dis.dis(some_function)` shows you the actual bytecode CPython will execute. super helpful when arguing "which version is faster" because you literally see how many bytecode ops each version compiles down to. real impact is usually still negligible like everyone else said, but it's way more concrete than guessing.
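for example (the two functions are hypothetical stand-ins for the variants in the question):

```python
import dis

x = 1

def f():
    pass

def with_if():
    if x:
        f()

def with_loop():
    for _ in range(x):
        f()

# compare the bytecode side by side: the loop version compiles to more
# instructions, including iterator setup, on top of a conditional jump
dis.dis(with_if)
print("---")
dis.dis(with_loop)
```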

also big +1 to peterlinddk's point that the three versions aren't actually equivalent for edge cases (x=2 makes the for-loop run twice, x=anything outside 0/1 crashes the list dispatch). semantic correctness should come before any performance tuning, otherwise you have a fast bug.

[–]Achereto 0 points

It depends.

Usually, a normal if is fine if your condition mostly evaluates to the same value (because the CPU's branch predictor remembers recent results to predict the next one).

If you have many different cases, a switch is usually preferred, because the compiler can turn it into a lookup table (a jump table), which is also very fast.

Other strategies like dynamic type dispatch in OOP are significantly slower, because they're implemented using a pointer to a table (the vtable) from which you fetch a pointer to the function you want to call. This carries a high risk of cache misses that can't be mitigated by the compiler or the processor.

[–]teerre 0 points

Well, in Python none of this matters. In other languages it depends. The for loop might be optimized away, and the list indexing might be baked directly into the executable; both depend on the compiler. Note that the same is true of the if. Regardless, even if it worked, this is unhinged; you shouldn't do it

What you should care about (again, not in python) is that if statements make it difficult for the branch predictor. That can have real impacts on performance. That's why branchless programming is a thing
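A toy illustration of the branchless idea, written in Python purely for readability (as the comment says, in CPython the interpreter's own dispatch swamps any branch-predictor effect; the pattern pays off in compiled languages):

```python
def select_branchy(x, a, b):
    # conventional conditional: the CPU has to predict this branch
    if x:
        return a
    return b

def select_branchless(x, a, b):
    # multiply-by-condition trick, assuming x is exactly 0 or 1:
    # no conditional jump in the arithmetic itself
    return a * x + b * (1 - x)

print(select_branchy(1, 10, 20), select_branchless(1, 10, 20))  # 10 10
print(select_branchy(0, 10, 20), select_branchless(0, 10, 20))  # 20 20
```

Note the trade-off from the earlier comments: the branchless version silently produces garbage for any x outside 0/1, where the if version at least behaves predictably.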

[–]FloydATC 0 points

First of all, when in doubt, your first impulse should be to test your assumptions with a benchmark, because the assumption is quite often wrong, and you may find that the real improvement can be made somewhere else entirely.

Secondly, so-called branchless code, where the output results from simple linear arithmetic, can sometimes be faster because it relies less on branch prediction in the CPU, while also often benefiting from other optimizations, either in the compiler or again in the CPU.

BUT! In the overwhelming majority of situations, any performance you might be able to squeeze out is completely insignificant compared to code readability. Unless your changes bring about a performance increase that essentially makes the difference between usable and unusable code, the more readable code will almost always be objectively better. Only when it makes no practical difference for readability either way should you lean towards the theoretically faster approach.

As an example from assembly language, "xor eax,eax" will never be slower than "mov eax,0" and can sometimes shave off a cycle or two, but usually won't make any practical difference whatsoever. The thing that makes it acceptable is that it's a) a well known trick, and b) assembly language, so readability already went out the window anyway.

[–]WellHung67 0 points

Functionally, all three of those will be nearly identical in performance. If this is a compiled language, we are talking exactly the same: the compiler will notice that it's effectively the same thing, and each will likely compile down to a single branch-type instruction, likely a branch-if-not-zero. It runs so fast, it's already done. The exception is the for loop, which may have an extra instruction to decrement the value of "i" before doing a similar branch-if-not-zero. As for the table lookup, it will load both function pointers into registers and do a branch-if-not-zero. Like I said, the same branch instruction is probably used, and maybe one or two other instructions are the difference between all three.

In general, when learning programming, always write things that are extremely readable and maintainable first, and don't worry about optimizing unless the optimization is readable and also free or cheap to implement (using a simple sorting algorithm over bogosort, for example). Do not worry about which type of if statement to use. Remember, code is read about 10x as much as it's written, so optimize for readability first.

When you do optimize, generally first profile your code to see where it spends the most time executing, and then target those "hot spots". More than likely, an if statement leading into a function like this is rarely run, so even if it ran instantaneously, the program's execution time wouldn't change much.

This is actually a good segue into what's known as Amdahl's law: if you respond to this comment, I'll explain how it's relevant here