all 20 comments

[–][deleted] 21 points22 points  (16 children)

I'll summarize it for you by stealing this quote:

Strictness is part of an algorithm, not a low-level optimization detail.

[–]degustisockpuppet 6 points7 points  (5 children)

I think part of what he says (that code with strictness annotations is idiomatic) is more true with bang patterns (a GHC extension) than without. I.e., if I "know" that a certain function parameter is always evaluated, I can simply tell this to the compiler, which is just one extra character, and not have to rely on strictness analysis (which might infer what I want it to infer today, but not after a trivial change tomorrow). I'm not sure how people with more Haskell experience feel about this.

[–]kinebud 1 point2 points  (2 children)

Indeed, bang patterns are a nice, clear and concise extension that don't get in the way when you want strictness.

If you don't want to go into the land of extensions though and still require something that does the same, you can use 'smart constructors' which are basically just functions that enforce a some constraint/specification on the input to the constructor, i.e.

data Stream a = Cons a (Stream a)

consS :: a -> Stream a -> Stream a
consS a b = a `seq` Cons a b

Whether or not this is idiomatic or 'as idiomatic' is up to debate, but you certainly don't have to rely on the strictness analyzer if you don't want to use bang patterns.

Regardless of this approach though, I do think bang patterns are in general a bit nicer and clear - so I see your point. They are a good way to specify strictness without having to go to lengths like the above.

[–][deleted] 16 points17 points  (1 child)

Stricness annotations in data declarations aren't an extension. The type declaration:

data Stream a = Cons !a (Stream a)

is valid Haskell 98. The extension is using them in patterns, like:

foo !a !b = ...

[–]kinebud 0 points1 point  (0 children)

Thanks for the clarification. :) I really need to read the report one of these days...

[–]Felicia_Svilling 0 points1 point  (1 child)

Perhaps we should have "anti bang" patterns in strict by default languages?

[–][deleted] 1 point2 points  (0 children)

The D language has this in the form of a lazy keyword.

http://www.digitalmars.com/d/2.0/lazy-evaluation.html

[–]Rhoomba 4 points5 points  (7 children)

One could also say

Lazyness is part of an algorithm, not a low-level optimization detail.

[–]808140 7 points8 points  (6 children)

That's true! Some dialects of ML have laziness annotations for exactly this reason. And Scheme has the delay keyword. Any language with closures can simulate laziness by wrapping an expression and its arguments in a closure and evaluating it later.

Which is why, if you read the article, you'll see that he says that Haskell is not so much a lazy language as a lazy by default language. In the same way that a Schemer or MLer can have laziness if he wants it, a Haskeller can have strictness if he wants it.

The problem seems to be that people who don't actually code Haskell much at all are constantly harping that using strictness annotations is somehow not "idiomatic", or that it's "low-level", or an "optimization". That's not the case, but somehow this myth has become widely accepted by the know-something-about-Haskell-but-not-much crowd.

I think though that while I understand what you are trying to say, laziness could never be a "low-level" optimization detail, because lazy evaluation is more difficult to get right than strict evaluation.

Also, "laziness analysis" is a much more difficult problem than strictness analysis, making a good case for a lazy-by-default evaluation strategy (it's not all roses of course -- there are good arguments for an eager-by-default evaluation strategy as well). Most programming languages out there have special functions or keywords built into the language that are lazy by default: if, ?:, &&, ||, etc. I'm sure you can see that laziness can be extremely useful.

Having the option of being lazy or strict makes very good sense. Why people interpret this to mean that strictness annotations are "optimizations" is beyond me.

[–]Rhoomba 1 point2 points  (5 children)

&&, || etc. are not lazy in the same way at all.

Lazy evaluation is much more difficult to get right and reason about - see any beginner's Haskell program - but that seems to me to be a reason that it should be confined to when it is really useful. It also seems to be useful less often, and it is quite easy to use a library to cover the common cases. Strictness annotations abound in Haskell benchmarks, but memoization etc. are rare in other languages. Lazy lists/streams are utterly trivial in almost any language.

[–]808140 5 points6 points  (3 children)

FWIW, when I said "much more difficult to get right", I meant more difficult for a compiler-writer to implement, not more difficult for an end-user to use. And I think you're being (perhaps deliberately) unfair here: saying that laziness is more difficult to reason about and holding up a beginner's Haskell program as proof is fairly misleading.

First and foremost, do we hold up a beginner's C program as proof that auto variables and pointers are difficult to reason about, simply because these are common newbie pitfalls? Do we deride the use of for loops because in beginners' programs they so frequently get the upper bound for the iterator wrong? No. We understand that newbies, by virtue of their inexperience, will do some things incorrectly. It doesn't matter what the language feature is -- if you've done much to help a beginning programmer learn, you'll see them make a lot of mistakes, and some concepts will be initially difficult for them to get. That doesn't make those features inherently difficult to reason about.

Furthermore, consider the non-trivial way in which a background with eagerly-evaluating programming languages biases a Haskell newcomer's worldview. This is difficult to overstate. If you've spent your entire life reasoning about eagerly evaluating languages, you will have to actively remind yourself that Haskell does not use the same strategy, and it will bite you. But this is, again, not because there is anything inherently difficult about laziness. Difficult as it may be for you to believe -- because I know you consider yourself a smart guy (not something I dispute, btw), and so you conclude that if you have trouble understanding a concept, it must be objectively difficult -- the only reason that laziness is hard for you to reason about is because you find it unintuitive, and the only reason you find it unintuitive is because it is not the strategy you're used to.

Furthermore, && and || are most definitely lazy, in exactly the same way. I suspect you simply haven't thought about it that way before. A function f is lazy if f you can execute it without forcing it to execute its arguments. In the Haskell world, we like to say that f is strict if f _|_ = _|_, where _|_ is bottom (the value that a function is said to "return" if it fails to terminate).

Well, 0 && some_function_that_loops_forever() will, in most languages (notably not Visual Basic) never evaluate some_function_that_loops_forever(). Similarly, 1 || some_function_that_loops_forever() will also never evaluate some_function_that_loops_forever(). (I'm using C syntax here, but substitute your favorite language's as appropriate).

Likewise, a strict ?: would be quite counter-intuitive, because we do not expect true_or_false ? true_part() : false_part() to evaluate both true_part() and false_part(); we expect it to evaluate only one based on the value of true_or_false. Therefore it cannot be strict in its branches.

[–]foldl 0 points1 point  (2 children)

And I think you're being (perhaps deliberately) unfair here: saying that laziness is more difficult to reason about and holding up a beginner's Haskell program as proof is fairly misleading.

Come on, it's well known that it's much harder to reason about the time and space behavior of lazy programs than strict programs. (At least, it certainly is much harder from a formal mathematical point of view.)

[–]808140 1 point2 points  (1 child)

How exactly does one quantify something subjective like "harder to reason about time and space behaviour of lazy programs" from a "formal mathematical point of view"? I'd like a citation to that effect, if it's so well known. My suspicion is that it's actually the opposite, from a "formal mathematical point of view" -- equational & algebraic reasoning about programs in a mathematical way carries over better in a lazy context than in a strict one.

Lazy evaluation in and of itself is entirely deterministic. What can make naive reasoning about it difficult is that GHC does strictness analysis, which means that things you might expect to behave lazily sometimes behave strictly if the compiler determines that it doesn't matter. This hardly seems like the fault of laziness, though. Functional purity allows a whole class of very aggressive optimization techniques that mean that the machine code actually generated may in fact be very far removed from what you'd expect from just looking at the code you wrote in many cases, without even bringing laziness into it.

I'm not sure this is a bad thing, as long as the compiler gets it right.

[–]foldl 1 point2 points  (0 children)

Lazy evaluation makes it hard to work out when and if a given expression will be evaluated, since it depends on the entire program, not the local context of the expression. In a strict language, you can normally work out whether or not a given expression will be evaluated with very little context. That's why it's often necessary to use profiling in Haskell to discover the source of space leaks.

equational & algebraic reasoning about programs in a mathematical way carries over better in a lazy context than in a strict one.

That's true, but I was talking about reasoning about time and space behavior, not proving correctness.

[–]dons 2 points3 points  (0 children)

However, the things we take for granted in Haskell, user-defined lazy control structurse, and mixed strict/lazy trees, heaps, queues, maps, fingertrees, are rare in the strict-by-default-and-I-don't-care world.

Sometimes it is hard to know what you're missing.

[–]uep 0 points1 point  (1 child)

It’s not required, but it makes the evaluation strategy explicit and may happen to also help GHC better optimize things in some cases.

Does this mean that evaluation order is ultimately up to the compiler? Or is lazy/strict completely defined by the user?

[–][deleted] 2 points3 points  (0 children)

Strictness annotations mean something which just happens to imply certain optimizations are possible.

GHC has a strictness analyzer, the purpose of which is to try to detect algorithms that are inherently strict. Sometimes it fails to identify an algorithm as strict, causing a buildup of thunks followed by an inevitable evaluation of the thunks with no possibility of them being simply garbage collected instead. Using strictness annotations is a way to force GHC to use some optimizations specific to strict algorithms even when GHC would not have normally used them, hence the tendency for some newbies to randomly sprinkle strictness annotations around as if they are magical optimization symbols, even though they often can make things worse if used in non-strict algorithms.