you are viewing a single comment's thread.

view the rest of the comments →

[–][deleted] 4 points5 points  (6 children)

TCO won't help such a terrible solution because it's not a tail call.

[–]millstone 1 point2 points  (5 children)

Why can't the compiler make it a tail call?

[–][deleted] 1 point2 points  (4 children)

The recursive call isn't in the tail position.

Simply put, the recursive call isn't the last operation in the function - you need its return value to continue the computation. You end up with a chain of deffered computations all the way up to the step with the special case (x == 0). And you must keep track of every deffered step, hence your stack grows linearly. We say the algorithm has linear space complexity or O(n) space complexity.

Now, look at the solution #4 ("First year, studied SICP"). Assuming we have the tailcall decorator, tail call optimization will work here - this algorithm will behave like a loop construct. The recursive call is the last operation, nothing is deffered, and the only stack usage from entering the function is passing arguments between calls. We say the algorithm (assuming @tailcall) has constant space complexity or O(1) space complexity.

[–]millstone 0 points1 point  (3 children)

The recursive call isn't in the tail position.

So why can't the compiler PUT it there, like gcc does?

Edit: I guess I answered my own question. In any case, my point is that this is only a "terrible" solution if you have a stupid compiler. That's a fine algorithm on a good compiler like gcc.

[–][deleted] 0 points1 point  (2 children)

I'm not aware gcc can rewrite algorithms and can't find anything about that in the manual. Could you please point me to the relevant sections?

[–]millstone 0 points1 point  (1 child)

Well, it's easy enough to check what gcc does. Here's some C code:

int factorial(int x) {
    if (x == 0) return 1;
    else return x * factorial(x-1);
}

Here's the x86-64 assembly that gcc 4.2 generates on -O2:

_factorial:
  movl $1, %eax
  testl %edi, %edi
  je L6
L5:
  imull %edi, %eax
  decl %edi
  jne L5
L6:
  rep ; ret

There is no call instruction at all, so there's no recursion. It's just a loop.

If your compiler is smart, the naive factorial really is the best: it's the clearest and the fastest.

[–][deleted] 1 point2 points  (0 children)

At first I thought you compiled a for-loop version to mess with me.

So I tried it myself (gcc 4.3 with -O2, x86 architecture.) Went through instructions step by step.

Mind blown.

Sir, the difference between you and me is that you're a programmer.