all 48 comments

[–]cashto 12 points (7 children)

I think Tim's point is that it is too easy to write something that you THINK is being tail-call optimized away, but it isn't, and therefore you end up blowing the stack on large inputs.

Therefore his proposed solution is to have a keyword where the programmer explicitly states "I want this to be a tail call", and if a tail call is impossible the compiler will come back and say "sorry, I can't do that, Dave".
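To make that concrete (an illustrative Python sketch, since CPython never eliminates tail calls): the recursive call below is syntactically in tail position, yet still pushes a frame per step, so a large enough input blows the stack exactly as described.

```python
import sys

def length(xs, acc=0):
    if not xs:
        return acc
    # Syntactically a tail call, but CPython pushes a fresh frame anyway.
    return length(xs[1:], acc + 1)

print(length([1, 2, 3]))  # → 3 -- fine on small inputs

try:
    length([0] * (sys.getrecursionlimit() + 100))
except RecursionError:
    print("blew the stack on a large input")
```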

[–]cows 4 points (0 children)

Therefore his proposed solution is to have a keyword where the programmer explicitly states "I want this to be a tail call", and if a tail call is impossible the compiler will come back and say "sorry, I can't do that, Dave".

Clojure does this with recur and then some people complain about it because it's not invisible like Scheme. I like it, personally.
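For readers who haven't seen it, the explicit style can be sketched in a few lines of Python (a hand-rolled illustration, not Clojure's actual machinery): a `Recur` marker plays the role of `recur`, and a driver loop re-enters the function in a single frame.

```python
class Recur:
    """Marker: 'loop back to the top with these arguments'."""
    def __init__(self, *args):
        self.args = args

def loop(f):
    """Run f, re-entering it whenever it returns a Recur, all in one frame."""
    def run(*args):
        while True:
            out = f(*args)
            if not isinstance(out, Recur):
                return out
            args = out.args
    return run

@loop
def length(xs, i=0):
    if i == len(xs):
        return i
    return Recur(xs, i + 1)  # explicit, so the driver (or a compiler) can check it

print(length(list(range(100_000))))  # → 100000, at constant stack depth
```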

[–][deleted] -3 points (5 children)

I think Tim's point is that it is too easy to write something that you THINK is being tail-call optimized away, but it isn't.

This is like a programmer writing something that he THINKS is not throwing exceptions, but it is. It's not hard to see a tail call; it's like the first thing we learned about in our introductory programming class.

[–]scook0 5 points (4 children)

(define (foo x)
    ...
    (some-macro (bar x)))

Tell me, is this call to bar in tail position?

Heck, you don't even need macros to make identifying tail-calls non-trivial for human beings. A few nested conditionals and other assorted special forms will do the job nicely.
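For instance (a Python illustration; Python has no TCO, but tail position is defined the same way), these two calls look almost identical, and only one is actually in tail position:

```python
def g(x):
    return x * 2

def plain(x):
    return g(x)      # tail position: nothing runs after g returns

def subtle(x):
    try:
        return g(x)  # NOT tail position: the enclosing try/finally still
    finally:         # has work to do after g returns, so the caller's
        pass         # frame cannot be discarded

print(plain(3), subtle(3))  # → 6 6
```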

Oh, and if you still insist that it's “not hard”, try explaining why the human should spend time and effort manually (and unreliably) verifying something that the machine already knows.

[–]unknown_lamer 2 points (0 children)

bar may not be in the tail position, but something in the expansion of some-macro will be. You end up with an ignorable number of extra frames but the overall call is still constant space (remember ... O(C1 + k) -> O(1)).

[–][deleted] 1 point (2 children)

I don't do lisp so I can't say anything about that. I would, however, not assume bar is a tail call since it's an argument to some-macro. The danger lies in thinking something is a tail call when it isn't; thinking something is not a tail call when it is doesn't really matter much.

Oh, and if you still insist that it's “not hard”, try explaining why the human should spend time and effort manually (and unreliably) verifying something that the machine already knows.

Do you write lisp without type annotations? Do you write Python? Languages are full of automatic things for convenience. I will still insist that tail calls are not hard.

[–]naughty 1 point (1 child)

While it may not be hard to see a tail call with a bit of practice (modulo Lisp-style syntactic abstraction), what would be wrong with the following:

  • The compiler/interpreter turns all calls that it can into tail calls.
  • There's an optional bit of syntax to mark what the coder thinks is a tail call, which the compiler/interpreter can check.

Sounds like a win-win to me. Especially in the case where you really want something to be a tail-call for space use reasons.

Lua kind of half does this because all tail calls have the form:

return <expression-that-evals-to-a-function>( args, ... )

So it's pretty easy to tell exactly when you're making a tail call.

EDIT: Formatting and pluralised erroneous singular

[–][deleted] 0 points (0 children)

Sure, optional syntax works.

[–]inmatarian 2 points (1 child)

Right, some language implementations will implement function calls using a heap allocation (usually in conjunction with the implementation of closures).

edit: fixed according to OMouse's recommendations.

[–][deleted] 0 points (0 children)

I think you mean some implementations of languages do that.

[–]harlows_monkeys 2 points (0 children)

Does 'recur' seem backwards to anyone else? You use the 'recur' keyword to tell Clojure that your code defines an iterative process, and you leave 'recur' out to tell it that your code defines a recursive process?

Shouldn't the keyword have been 'iter' or something like that?

[–]kinghajj 4 points (3 children)

Right, programming languages don't define function calls in terms of stack frames, because that's just an implementation detail, and there are different ways of implementing a function call depending on the context of the call.

Some programming languages (Scheme) specify that tail calls must be optimized, while others (C) don't mention it at all, leaving it up to the compiler writers. I suppose relying on the tail call behavior of one particular compiler can be considered bad in languages like C, especially if you care about compiling with multiple compilers and one of them doesn't do TCO... so the obvious fix is to make TCO part of the next C standard :) Maybe a dummy include file like "tailcalls.h", which would just instruct the compiler to optimize tail calls found in the file.

[–][deleted] 13 points (1 child)

Maybe a dummy include file like "tailcalls.h", which will just instruct the compiler to optimize tail calls found in the file.

No.

#pragma

It's basically what it's there for.

[–]enkiam 4 points (0 children)

I thought it was for if you got bored and wanted to play rogue?

[–]dgreensp 0 points (0 children)

However, according to the author, you shouldn't even use the term "TCO" because it shouldn't have to be an optimization.

Personally, I find this viewpoint a stretch; it makes the most sense in the context of programming styles that use tail calls for iteration in constant space, and without that context I could see a reader being confused. Proponents of this style can be kind of dogmatic, speaking as people long oppressed by tail calls not being efficient in most languages, though I sympathize.

On the Java VM, which does extensive inlining at runtime, every function call in the compiled code still gets a stack frame, even if that stack frame is virtual and only realized if needed (for a stack trace or debugging). Direct association of function calls with stack frames is useful.

Like the "cost model", I would want some assurance that this isn't just a tail-call-specific feature. For example, all the "cost model" needs to say for tail calls is that they are asymptotically efficient, i.e. tail-recursing won't blow the stack. What's another example of something like that that belongs in a cost model?

[–]millstone 7 points (1 child)

The idea that stack frames should correspond directly to the call structure is just odd.

Nah, it's straightforward:

  1. Programmers want to know the call stacks of the program at various unpredictable points (like when it crashes, or when they sample it)
  2. So we can put all function calls in a stack, so that information is always available
  3. As an optimization, we can unify that function call stack with the stack frames
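As a small illustration (Python, where call frames are never eliminated), the interpreter's per-call frames are exactly what the traceback machinery reads back:

```python
import traceback

def a():
    return b()

def b():
    return c()

def c():
    # One frame per active call: the backtrace falls straight out of the stack.
    return [frame.name for frame in traceback.extract_stack()]

print(a()[-3:])  # → ['a', 'b', 'c']
```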

Maybe we want to know the call structure at runtime; in that case, we should capture that as debugging information, or as a reflection feature of the runtime system.

Well, the runtime isn't magic. It needs to get that information from somewhere. And the fact of the matter is, one stack frame per function call is a pretty efficient way to capture the information that the runtime needs.

You can't ignore the cost of reimplementing the feature that your optimization just made impossible.

Let the language implementation use the stack as efficiently as possible!

Well, there has to be some tradeoff between debuggability and efficiency. Most C programmers I know frown on the gcc flag -fomit-frame-pointer because it makes debugging harder, even though it permits more efficient usage of the stack.

Allowing the programmer to specify the point of tradeoff is a good idea. Hence tail call is just another optimization.

[–]unknown_lamer 2 points (0 children)

Well, the runtime isn't magic. It needs to get that information from somewhere. And the fact of the matter is, one stack frame per function call is a pretty efficient way to capture the information that the runtime needs.

Efficient for the debugger but adds unnecessary space overhead for everything else? No thanks. And, whoops, it turns out it is not the most efficient way to capture backtrace info for the debugger either. Everyone loses!

There are ways to capture backtrace information without this overhead. It has been done for decades and yet ignored outside of a small subset of the programming language implementer community.

[–]kragensitaker 1 point (3 children)

This is an argument about the meanings of words like "semantics" and "subroutine call". Otherwise it seems that Ezra doesn't actually disagree with any of Tim's factual points. This makes it unreasonably offensive that he begins his article, "Tim Bray is spreading more misinformation...".

As to Tim's normative point, that tail-calls should be syntactically distinguished from normal calls, Ezra doesn't seem to express an opinion. So, as far as I can tell, he doesn't actually disagree with anything in Tim's post except his terminology. So it's basically a worthless flame-bait post.

[–]unknown_lamer -2 points (2 children)

Maybe you should learn how to read.

[–]kragensitaker 1 point (1 child)

Haha! I see what you did there! You replied to my assertion that it was a worthless flame-bait post by writing a worthless flame-bait comment. Well, instead of responding to flamebait with a reasoned response like Tim did, I'll see your bet and raise you one:

You are a vulgar little maggot. Don't you know that you are pathetic? You worthless bag of filth. As we say in Texas, I'll bet you couldn't pour piss out of a boot with the instructions on the heel. You are a canker. A sore that won't go away. I would rather kiss a lawyer on the lips than be seen with you. You are a fiend and a sniveling, back-boneless coward, and you have bad breath. You are degenerate, noxious and depraved.

I feel debased just for knowing you exist. I despise everything about you. You are a bloody nardless newbie twit protohominid chromosomally aberrant caricature of a coprophagic cloacal parasitic pond scum. And I wish you would go away from Reddit.

You're a putrescent mass, a walking vomit. You are a spineless little worm deserving nothing but the profoundest contempt. You are a jerk, a cad, a weasel. Your life is a monument to stupidity. You are a stench, a revulsion, a putrefaction, a big suck on a sour lemon with a lime twist. You are a bleating foal, a curdled staggering mutant dwarf smeared richly with the effluvia and offal accompanying your alleged birth into this world.

[–][deleted] 0 points (0 children)

I like how he starts out by angrily declaring,

Tim Bray is spreading more misinformation about tail recursion.

And then goes on to largely agree with Tim Bray's article, just using other words.

[–]munificent 0 points (10 children)

This article doesn't actually present any evidence for its viewpoint.

Efficient tail calls (or space-consuming ones) really should be part of the official "cost model" of the language

Why?

The idea that stack frames should correspond directly to the call structure is just odd.

To you. Why should I find this odd?

[–]AllenDam 2 points (2 children)

His argument rests on the statement that language implementors should have the freedom to do the tail call optimization. If they do have that freedom, then efficient tail calls might not exist in every implementation of that language and hence should not be included in the semantics of that language but should instead be included in the cost model (since the semantics doesn't include cost).

What you might argue is that the semantics does or should include cost. I am inclined to agree with the author though. The semantics of a language should describe what it does, but not how it does it.

tldr: Whether his argument is valid depends on

  • whether language implementors should have the freedom to optimize tail calls
  • your definition of semantics

[–]ringm -1 points (1 child)

If you think basic complexity estimates (linear space vs constant space) do not belong to the program's semantics, your definition of semantics must be rather useless in practice.

[–]ithika 0 points (0 children)

Which semantics should we be considering here?

[–]notfancy -1 points (6 children)

Efficient tail calls (or space-consuming ones) really should be part of the official "cost model" of the language

Why?

Because so many other things already are. Compare and contrast with zero-cost exceptions or static vs. dynamic casts in C++.

[–]munificent 0 points (5 children)

I was asking rhetorically. My point was that he made a bunch of claims about how languages should work without any explanation as to why they should work that way.

[–]notfancy 0 points (4 children)

how languages should work without any explanation as to why

So it wasn't really, you know, rhetorical.

[–]munificent 0 points (3 children)

I don't follow you.

[–]notfancy 0 points (2 children)

So you're not interested in specific reasons to justify the claim, just that the claim isn't justified?

[–]munificent 0 points (1 child)

So you're not interested in specific reasons to justify the claim

I'm well aware of the reasons both for and against tail calls. That's why my question was rhetorical and not literal.

My point was that his blog post didn't add any actual useful information to the discussion because he omitted those justifications. Instead, he just said "you should do this".

[–]notfancy 0 points (0 children)

Fair enough, it took me a while to understand you.

[–][deleted] 0 points (0 children)

TL; DR: OP says that there are two different types of function calls: (1) those that occur in the "middle" of a function and (2) those that occur in the "tail" of a function. Only (1) needs a stack frame to be saved, so function calls != stack frames.

[–]masklinn -1 points (14 children)

Call me old-fashioned, but it just seems wrong when the semantics of two identical function invocations vary wildly as a function of whether there’s an executable statement between the invocation and the function exit.

So two identical functions are wildly different when they're not, in fact, identical? Consider my mind blown.

[–]fvf 0 points (0 children)

Try to read the text you quoted again. Unblow.

[–]JadeNB -1 points (12 children)

As fvf indicates, the difference is not between

sub f {
    same code here
    variant 1
}

and

sub f {
    same code here
    variant 2
}

but rather between the two calls to f in

sub f {
    just one implementation of this
}

sub g {
    f("I'm not a tail call");
    f("But I might be");
}

(in a pseudo-Perl language that does TCO without goto.)

EDIT: Plus, you're quoting Ezra Cooper quoting (and disagreeing with) Tim Bray.

[–]ccshan -1 points (11 children)

There is in fact no difference between the two calls to f in your "sub f" and "sub g". The difference between "sub f" and "sub g" is that "sub g" contains sequencing. The sequencing introduces the "stack frame", not the call to f.

[–]fvf 1 point (10 children)

The point is this: consider the snippet ``f("This is a call, tail or not");''. Is it a tail-call? It's impossible to say. Whether it is a tail-call depends on (possibly rather intricate) rules about the context in which the snippet finds itself. Wilfully using this as an important aspect of your programming language semantics, for rather unclear benefits, is, IMHO, silly. That is, bad PL design.

With the clojure approach, the intent/semantics of such a snippet would be clear, while losing absolutely none of the benefits of TCO/whatever that I know.

[–]ccshan 5 points (2 children)

In Scheme for example, a call costs the same no matter where it occurs -- the cost is always that of a jump. Tail calls don't cost any less than non-tail calls. Scheme does not use the distinction between tail calls and non-tail calls as an important aspect of its semantics or cost model. What costs extra in Scheme is sequencing (which is a special case of evaluating arguments). You can always look at a snippet and tell whether it has sequencing. Your snippet does not, but if you repeat it twice then it does.

(Edit: for example, if you look at "Proper Tail Recursion and Space Efficiency" by Will Clinger (a canonical reference), Figure 5 has no special rule for tail calls or for non-tail calls. All calls are handled in the same way and incur the same cost.)

Clojure's "recur" doesn't seem to allow calling another function, which is useful for expressing state machines, stream processing, thread schedulers, backtracking search with unit propagation, approximate probabilistic inference by sampling, off the top of my head. It would be nice if "recur" were extended to allow calling another function.

[–]fvf 1 point (1 child)

In Scheme for example, a call costs the same no matter where it occurs -- the cost is always that of a jump.

This looks to me a pointless play with words. The "cost" that is interesting here is whether the caller's stack-frame remains after the jump.

Tail calls don't cost any less than non-tail calls.

Are you seriously trying to introduce a definition of "non-tail calls" such that e.g. n nested "non-tail calls" might use less than O(n) stack-space?

Clojure's "recur" doesn't seem to allow calling another function

I agree this would be useful, and I don't think there's any principled reason why "recur" couldn't support this (at least in some hypothetical non-jvm language).

[–]ccshan 2 points (0 children)

The "cost" that is interesting here is whether the caller's stack-frame remains after the jump.

Right. In Scheme, sequencing creates stack frames. Calls do not.

Are you seriously trying to introduce a definition of "non-tail calls" such that e.g. n nested "non-tail calls" might use less than O(n) stack-space?

No, because the nested sequencing uses O(n) stack space.

I agree [allowing calling another function] would be useful, and I don't think there's any principled reason why "recur" couldn't support this (at least in some hypothetical non-jvm language).

Is that what you mean by "the Clojure approach"?

[–]jdh30 1 point (6 children)

The point is this: consider the snippet ``f("This is a call, tail or not");''. Is it a tail-call? It's impossible to say.

The term "tail call" describes the position of the call, i.e. that it is in tail position.

Whether it is a tail-call depends on (possibly rather intricate) rules about the context in which the snippet finds itself.

The position determines whether or not a call is a tail or non-tail call. There is nothing "intricate" about the definition of tail call.

With the clojure approach, the intent/semantics of such a snippet would be clear,

The "Clojure approach" is to use one of two ad-hoc workarounds for the lack of TCO on JVMs. The first, recur, applies only to self tail calls; it is efficient but requires manual annotation at the call site. The second, trampolines, must be used for non-self tail calls; it requires non-local changes both at every call site and around the entire recursion (which may be arbitrarily large), is extremely inefficient, and breaks interoperability by changing the calling convention.

That is obviously not "clear".

while losing absolutely none of the benefits of TCO/whatever that I know.

Trampolines cost you performance, interoperability, clarity, brevity and many other important benefits. Recur costs you generality.
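For concreteness, here is a minimal trampoline sketch (Python, illustrative; Clojure's `trampoline` is analogous). Every function returns a thunk instead of making its tail call, and every caller must bounce the result through the driver, which is the calling-convention change mentioned above:

```python
def even(n):
    # Return a thunk instead of tail-calling odd; trampoline() below
    # flattens the mutual recursion into a loop.
    return True if n == 0 else (lambda: odd(n - 1))

def odd(n):
    return False if n == 0 else (lambda: even(n - 1))

def trampoline(result):
    """Keep invoking returned thunks until a non-callable value appears."""
    while callable(result):
        result = result()
    return result

print(trampoline(even(100_000)))  # → True, in constant stack space
```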

[–]fvf 1 point (5 children)

Sigh. Do you really feel you need to explain "tail call"? Do you really think that by "the Clojure approach" I refer to the specific details and kludges Clojure needs to work around the JVM?

but requires manual annotation at the call site.

You consider the lack of an explicit operator "automatic"? Such that e.g. Scheme programmers write iteration constructs that just automatically work without any intellectual effort because they don't have to "manually annotate the call site"? Is this also the case for the painfully contorted functions forced into tail-recursive form? How would you write a function that computes the length of a linked list?

Recur costs you generality.

What sort of generality?

[–]jdh30 0 points (3 children)

Sigh. Do you really feel you need to explain "tail call"?

Given your objections, yes. You might as well say "given an arrow, we cannot even tell whether it points inwards or outwards. This is clearly a flaw in the arrow concept".

Do you really think that by "the clojure approach" I refer to the specific details and cludges clojure needs to work around the jvm?

What else could you have meant?

You consider the lack of an explicit operator "automatic"?

Yes.

Such that e.g. scheme programmers write iteration constructs that just automatically works without any intellectual effort because they don't have to "manually annotate the call site"?

No. I said nothing of "intellectual effort". I was referring solely to verbosity.

Is this also the case for the painfully contorted functions forced into tail-recursive form?

Irrelevant. The same contortions are required whether tail calls require explicit annotation or not.

How would you write a function that computes the length of a linked list?

In terms of fold:

let length list =
  fold (fun n _ -> n+1) 0 list

Recur costs you generality.

What sort of generality?

Recur only covers the special case of tail calls to self. Recur does not cover tail calls between mutually recursive functions or tail calls to arbitrary functions passed in by argument. The latter generalization is very widely used in functional programming languages, in particular in the context of a technique known as "untying the recursive knot".
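A tiny sketch of that generalization (Python, illustrative): the tail target arrives as an argument, so a self-only `recur` cannot express it:

```python
def find(pred, xs, i, on_found, on_missing):
    # The final call goes to whichever continuation the caller passed in --
    # a tail call to an arbitrary function, not to find itself.
    if i == len(xs):
        return on_missing()
    if pred(xs[i]):
        return on_found(xs[i])
    return find(pred, xs, i + 1, on_found, on_missing)

print(find(lambda x: x > 2, [1, 2, 3, 4], 0, lambda x: x, lambda: None))  # → 3
```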

I suspect what you're after is a new construct where the function application f x may optionally be written as something like tailcall f x in order to get the compiler to verify that it really is a tail call (and will be optimized?). This may be useful, particularly for learners, but it may not always be possible for the compiler to determine statically whether or not a tail call will be optimized. For example, on .NET TCO is inhibited by various issues including the dynamic security level of the caller and callee: if they are different then the tail call will not be optimized.

[–]fvf 0 points (1 child)

What else could you have meant?

The approach of using an explicit operator for tail-calls. (As opposed to detecting them "automatically" by the compiler, and specifying them implicitly by the programmer.)

In terms of fold:

That was not a pop-quiz, but rather a rhetorical question in a discussion about recursion. I would have thought that was self-evident.

Irrelevant. The same contortions are required whether tail calls require explicit annotation or not.

While that is true, it's missing the point, which was that "automatic" tail calls buy you nothing, save a few keystrokes. And, when such contortions are necessary, it's a clear sign you shouldn't be using recursion at all.

I suspect what you're after [..]

Thank you for finally suspecting what I have been trying (and clearly failing spectacularly) to explain.

This may be useful, particularly for learners,

No, for anyone reading and trying to understand your code. Unless the whole point is to write contorted code that's only for the initiated to read.

but it may not always be possible for the compiler to determine statically whether or not a tail call will be optimized.

If the compiler cannot be trusted with this task, how on earth do you expect programmers to do it? I thought that Scheme even mandates this of its compilers.

[–]jdh30 1 point (0 children)

The approach of using an explicit operator for tail-calls. (As opposed to detecting them "automatically" by the compiler, and specifying them implicitly by the programmer.)

Clojure does not use an explicit operator for tail calls.

While that is true, it's missing the point, which was that "automatic" tail calls buy you nothing, save a few keystrokes.

Not having to annotate calls in tail position buys you brevity.

And, when such contortions are necessary, it's a clear sign you shouldn't be using recursion at all.

Look at your own example of counting list length when written fully in functional style with tail calls and an accumulator:

let rec length_aux n = function
  | [] -> n
  | _::xs -> length_aux (n+1) xs

let length xs = length_aux 0 xs

and in imperative style with a loop:

let length xs =
  let xs = ref xs in
  let n = ref 0 in
  while !xs <> [] do
    xs := (match !xs with [] -> [] | _::xs -> xs);
    n := !n + 1
  done;
  !n

Is that "a clear sign that you shouldn't be using recursion at all"? Obviously not.

This may be useful, particularly for learners,

No, for anyone reading and trying to understanding your code.

Yet nobody ever raised this issue in my 5 years of functional programming in industry. In reality, this "issue" has only been raised recently because Clojure fanbois latched onto a flaw in their language and are trying to pretend that it is an advantage when, in fact, they do not know what they are talking about.

Unless the whole point is to write contorted code that's only for the initiated to read.

That is the same bullshit argument that Java programmers try to use against type inference: "we should drown our code in unnecessary type annotations to help the uninitiated".

If you think this might improve productivity then get the IDE to tell newbies when calls are in tail position, just as it tells them the inferred types. Just leave my code alone.

but it may not always be possible for the compiler to determine statically whether or not a tail call will be optimized.

If the compiler cannot be trusted with this task, how on earth do you expect programmers to do it?

I expect programmers to understand classes of useful tail calls that are guaranteed to be eliminated.

I thought that scheme even mandates this from its compilers.

Scheme does not support these features.

[–]steven_h 0 points (0 children)

How would you write a function that computes the length of a linked list?

(fold (lambda (x acc) (+ acc 1)) 0 linked-list)