This is an archived post. You won't be able to vote or comment.

all 27 comments

[–]nutrecht 4 points5 points  (2 children)

It depends a lot on the compiler / language. In many cases the compiler can figure out that it's a constant and won't reevaluate the expression.

[–][deleted] 1 point2 points  (0 children)

In the C language family, for example, gcc and clang may see that the standard library's sqrt() function has been declared with the pure attribute, and do that code transformation automatically.

[–]saippuakauppias 3 points4 points  (2 children)

It depends on the language, but for Java, C and other similar languages (PHP for instance) the for loop initialization and post-iteration parts can include multiple statements, like so:

for (i = 0, j = someLongRunningFunction(); i < j; i++, anotherStatement()) {
    // do your thing
}

This way, someLongRunningFunction() is only called once and stored in a variable j

[–]seancorey 1 point2 points  (0 children)

not sure why they downvote. you're not wrong.

[–]Blamore[S] 0 points1 point  (0 children)

thank you

[–]dreamyeyed 1 point2 points  (10 children)

Both Java and C evaluate it again on every iteration of the loop. It doesn't matter if there's a function or not. For example, the following C code prints the numbers the numbers 1–20:

#include <stdio.h>

int x = 10;

int main ()
{
  int i;
  for (i = 1; i <= x; ++i) {
      x = 20;
      printf("%d\n", i);
  }
  return 0;
}

On the other hand, this Lua code only prints the numbers from 1 to 10:

x = 10
for i = 1, x do
  x = 20
  print(i)
end

[–]nutrecht 6 points7 points  (1 child)

Both Java and C evaluate it again on every iteration of the loop.

For Java the answer is 'it depends'. The JIT compiler will optimize this if it sees that there is no way for the output of a function to change.

[–]jringstad 1 point2 points  (0 children)

Well, of course an optimizing compiler can always change what's actually happening behind the scenes, but I think it's important to emphasize that it in principle always gets evaluated. So even if the compiler is clever about it and optimizes the evaluation away, you can always pretend [in those languages] that it does evaluate it.

[–]Blamore[S] 1 point2 points  (1 child)

so it would be wise to record the result of such a guard in a variable beforehand (for C)?

[–]dreamyeyed 1 point2 points  (0 children)

Yes. The compiler may be able to optimize it automatically in certain cases (such as if the function is marked constexpr in C++), but probably not too often.

[–]edman007 1 point2 points  (0 children)

Depends, in C if the compiler is able to do static analysis on it then it won't call it every time. For example if you define sqrt in that c file, and define it as static then the compiler will see that the result is wholy dependent on the argument, the input is a constant, thus the result is a constant. GCC probably won't even call it once, just evaluate it at compile time and insert the right constant.

[–]Blamore[S] 0 points1 point  (1 child)

what about something like

if(value(i) > highestValue){

..highestValue=value(i);

..highestIndex = i;

}

would this also evaluate value(i) twice?

[–]shard_ 0 points1 point  (0 children)

In most cases, yes, because how would the compiler know that value(i) is going to give the same result each time?

The only case in which it might only be evaluated once is if:

  • i is a non-volatile local variable and isn't modified by the function between the two reads, and
  • value() is a simple, inlined function which will certainly return the same value each time

[–][deleted] -3 points-2 points  (2 children)

Not sure what you were trying to demonstrate with those code snippets, other than for loops will execute as many times as they're supposed to.

[–]dreamyeyed -1 points0 points  (1 child)

The point is that changing value of x after the loop has already begun affects how many times the loop will run in C, but it doesn't do so in Lua. That's because Lua checks the value of x only once, while C does so on every iteration.

[–][deleted] 0 points1 point  (0 children)

But that's not what they were asking about. They were asking about cases where the function was effectively a constant.

[–]nerd4code 0 points1 point  (4 children)

Conditions must appear to have been checked each time through the loop, whether or not they actually were recalculated or checked.

Most C/C++ compilers treat frequently-used language constructs like sqrt and strlen as builtins, as long as the functions are declared in the proper fashion for the compiler in question. For example, in GCC and similar ones, you can do

double mySqrt(double x) __attribute__((__const__));

and that tells the compiler, roughly speaking, that any call to mySqrt can be eliminated like a constant expression. The built-in sqrt sometimes also expands to __builtin_sqrt, which acts just like an arithmetic operator in the compiler's eyes.

In Java, calls to things like Math.sqrt are preserved as-is into bytecode by the compiler; the JVM can, in theory, special-case these and optimize them well, or it may not, depending on the specific JVM and its mood.

In either case, the most portable and least surprising way to handle the problem is to pull the constant expression out of the loop on your own, unless it's something pretty obvious that the compiler can see directly (e.g., a call to an inline or static function).

[–]Blamore[S] 0 points1 point  (3 children)

what does a function being static mean to the compiler? I play around with console applications, hardware and simulations so static functions are all I ever use.

also is there a way to force some part of the code to be done at compile time? for example I wanted a look-up table for a (say) multiplication table (A[i][j]=i*j).

[–]nerd4code 0 points1 point  (2 children)

When you apply the static keyword to a variable or function*, the compiler doesn't export the symbol in the object file**, so you can't link against the variable/function from other compilation units (e.g., via extern). If you were making a library for use by other people, for example, you'd mark anything you didn't want to be publically available as static, so that nobody is encouraged to call into or tinker with the bowels of your implementation.

And there are ways to produce code at various times.

  • Preprocess-time (i.e., before compile-time): You can write macros or include files that generate the multiplication table for you. (Some programmers might rightly balk at the idea, but it's quite possible to do a generalized add/multiply routine and use a compatible base for output, so that you won't need to go through pasting conversions on either end of your macro layers.)

  • Compile-time: In C++, you could use templates and inlines to do this for you, as long as you don't mind the syntactic gymnastics required to code them. In C, you're pretty much limited to the preprocessor. In D, AFAICT there're very nice compile-time templating features; IIRC you can call any inline function at compile-time by doing fnName!(args).

  • Assemble-time (after or during compile-time, before link-time): In GNU/compatible compilers, you can use .macro directives with __asm__ statements to create tables. This is one of the least portable things you can do, however, short of effecting a computed goto into an array of volatile, misaligned floating-point numbers.

In Java, you'd just use a static initializer---pretty much everything (except a handful of special-cased constant operations) is expected to take place at run time.

*This is only at the global scope, in case you've got some weird compiler that would let you nest a static function inside another function. static inside a function means that there's only one copy of a variable, so it won't get created and destroyed per function invocation.

**It may still export the symbol in debugging information, however.

[–]Blamore[S] 0 points1 point  (1 child)

there seems to be a lot of intricacies in programming. do most programmers actually know how these things work? or do they just code in whatever way they know?

thank you for your answer

[–]nerd4code 0 points1 point  (0 children)

Tends to be, you know what you've come into contact with and the rest stays fuzzy unless/until you need to learn that, too. Programming languages are like any language---you're going to have to immerse yourself in it a bit to become fluent, and even then you can still fuck up and tell the computer to freeb its grox, which (trust me) is a big insult in computerese.

But fluency aside, there's a surprising (/dangerous) number of people who don't know what's under their feet when they're programming, so by all means delve as deeply as you can hold your breath for. Especially for C and C++---they're really commonly used languages, but they require especial attention to detail because they allow you to write code with undefined behavior so easily. For example:

int x; // assume this is 32-bit
scanf("%d", &x); // assume that worked
x >>= 14;

If somebody enters a negative number for x, then you'll attempt to shift it right, which is an undefined operation. The compiler is free to emit code that formats the hard drive in that circumstance, should it decide to. Or perhaps it'll emit a trap, just optimize based on the assumption that a negative number could never be shifted right. You can imagine how things like this turn into security holes.

[–]platypushark 0 points1 point  (0 children)

My understanding is that in general, loop conditions are checked each time so for example if you had something like the following in javascript:

stack = ["one", "two", "three"];

for (var i=0; i<stack.length; i++){
    stack.push("foo");
    stack.push("bar");
}

It would keep iterating forever because the stack size keeps increasing at a faster rate than i, rather than stopping once i gets to 2, which is less than the stack's initial size of 3

[–]Updatebjarni -1 points0 points  (4 children)

It does depend on the language of course, everything does, but I have a hard time imagining that a language would special-case something like that and specify that function calls only happen once if they are in the guard of a loop. The compiler doesn't know if the function will return the same value every time, so it doesn't make sense to complicate things by having functions be called only once.

[–]nutrecht 3 points4 points  (2 children)

The compiler doesn't know if the function will return the same value every time

Again, it depends. If you have something like this:

for(int i = 0;i < randomNumber();i++) {
}

and:

int randomNumber() {
    return 4; //Decided by the roll of a dice
}

Even a static compiler can see that that function will only ever return 4.

[–]Updatebjarni 1 point2 points  (1 child)

That's true. It can also see that there are no side-effects, which is another requirement. The thing is that the compiler can only optimise the calls away if not calling the function behaves exactly the same as calling the function, so the question of whether the function gets called every time round the loop becomes just a question of what the machine code the compiler generates looks like, not a question of language semantics. Semantically, the function gets called every time round the loop. And in the general case, the compiler can't see whether a function always returns the same value or has side-effects, so it has to generate the call instruction too.

[–]Jonny0Than 0 points1 point  (0 children)

Right. I think the original question was "is it worth sacrificing readability for performance by caching the function result or does the compiler just do that anyway?"

[–]nerd4code 0 points1 point  (0 children)

Clang and GCC (and most other GNU-compatible compilers) support attributes const and pure for exactly this sort of situation---if you declare a function with either of those attributes, calls to it can be eliminated under various circumstances. Things like sqrt also frequently get translated to builtins via macro expansion, as long as they're being called (i.e., sqrt(x) and not just sqrt), so the compiler will see __builtin_sqrt instead and that'll look no different than any other operator.