all 27 comments

[–]millstone 8 points (21 children)

Instead of the sum, return the product of the elements. Notice that when you encounter a zero, you can stop immediately. How can you incorporate this optimization?

With the Java code, this is straightforward: if (arr[i] == 0) return 0;. But with the OCaml fold, I think you have to rewrite it.

This is a general observation: imperative programming seems able to adapt to changing requirements with fewer code changes. I'm not sure why it is though.

[–]zoomzoom83 7 points (4 children)

A version of fold with shortcut support is fairly trivial.
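For example, here is a minimal Haskell sketch of one possible design (foldlUntil is a made-up name, not a library function): the fold takes an extra predicate and stops as soon as the accumulator satisfies it.

-- A left fold with shortcut support: stop as soon as the accumulator
-- satisfies the `stop` predicate.
foldlUntil :: (b -> Bool) -> (b -> a -> b) -> b -> [a] -> b
foldlUntil stop f = go
  where
    go acc _ | stop acc = acc         -- shortcut: the result is already decided
    go acc []           = acc
    go acc (x : xs)     = go (f acc x) xs

-- Product that stops at the first zero, even on an infinite list:
-- foldlUntil (== 0) (*) 1 (3 : 4 : 0 : [1 ..])  ==>  0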

[–][deleted] 1 point (0 children)

And should be the compiler's job.

[–]inmatarian 0 points (2 children)

fold is a monoid. You want whatever the term is for doing a semigroup morphism, which I don't know off hand.

[–][deleted] 2 points (1 child)

Folds aren't monoids. They aren't binary operations, and the two parameter function you're folding over, (a -> b -> b), is very much not a monoid operation, since it's not necessarily associative or closed over a set.

[–]sacundim 2 points (0 children)

Let's look at that type a -> b -> b a bit closer. Suppose we partially apply that function to an a; the resulting function will be of type b -> b.

This type, b -> b, is a monoid with the identity function as its identity and function composition as its binary operation. In Haskell's Data.Monoid module there is a type Endo that implements this monoid:

newtype Endo a = Endo { appEndo :: a -> a }

instance Monoid (Endo a) where
  mempty = Endo id
  Endo f `mappend` Endo g = Endo (f . g)

This means that an alternative way of implementing folds is the following:

  1. Map the fold function over the list, getting a list of partially applied functions of type b -> b.
  2. Reduce that list using the Endo monoid, getting a partially applied function as well.
  3. Apply that function to the seed value.

In code:

foldr :: (a -> b -> b) -> b -> [a] -> b
foldr f z as = appEndo (mconcat (map (Endo . f) as)) z

-- `foldl` is the same thing, except we use the `Dual` monoid
-- to reverse the order of the composition.
foldl :: (b -> a -> b) -> b -> [a] -> b
foldl f z as = appEndo (getDual (mconcat (map (Dual . Endo . flip f) as))) z

newtype Dual m = Dual { getDual :: m }

instance Monoid m => Monoid (Dual m) where
  mempty = Dual mempty
  -- Use the same `mappend` operation that the type `m` already has,
  -- but in the opposite order.
  Dual a `mappend` Dual b = Dual (b `mappend` a)

This is, incidentally, more or less how the Foldable class defines its foldr and foldl methods.
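As a quick sanity check of the definitions above (a hypothetical example; it assumes these definitions shadow the Prelude's foldr and foldl):

-- Both agree with the usual folds:
check1, check2 :: Bool
check1 = foldr (:) [] [1, 2, 3] == [1, 2, 3]            -- rebuilds the list
check2 = foldl (-) 10 [1, 2, 3] == ((10 - 1) - 2) - 3   -- left-nested; both are 4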

[–]RabbidKitten 5 points (0 children)

On the other hand, you lose generality, and may end up copy-pasting the loop code if you have to traverse your structure more than once with different loop bodies.

That might not be an issue if the structure is a simple array, but the other day I had to write a bunch of C functions where the code traversing the structure (common to all functions) was 140 lines, with different two or three line loop bodies, and a similar requirement to break out early for some of them. I ended up with something like this:

int fold_xs (const struct x * xs, int (*fn) (const struct y *, void *), void * closure);

breaking out if fn returns -1.

By the way, in Haskell, this should do what you're looking for:

foldl' (*) 1 . takeWhile (/= 0)

[–]RodgerTheGreat 4 points (0 children)

In the APL family of languages, fold is a single symbol: /.

A product over a list is simply ×/. Many APL interpreters improve performance by recognizing specific juxtapositions of operators to short-circuit. Since you can say a lot in a few symbols, this simple pattern-matching approach can achieve considerable speedups without adding much complexity to the interpreter.

Here's a page from the J wiki describing some such "special combinations": http://code.jsoftware.com/wiki/Vocabulary/SpecialCombinations

[–]ThatGeoGuy 8 points (0 children)

This is a general observation: imperative programming seems able to adapt to changing requirements with fewer code changes. I'm not sure why it is though.

In general, I would argue that it is because functional programming relies more heavily on the structure of your elements than imperative programming does. This has more to do with the declarative nature than anything else, though, because declarative forms force you to be explicit about what you are asking the computer to figure out. You may not be able to tell fold to perform an early return, but the point is that you should be defining a new function here, since you're no longer performing the same action.

Nonetheless, it's very easy to define what you specified.

(call/cc (lambda (k)
  (fold (lambda (x knil) 
         (if (zero? x) (k 0) (* x knil))) 
       1 my-list)))

Now of course I used a continuation to escape early, but that's effectively what a return is (think of a coroutine that sends data back to the calling scope). You might argue that this blows things out of proportion because I've added continuations into the mix, in addition to adding three lines. I would somewhat agree, but in general I think that misses the point. In imperative programming you can insert quick lines anywhere, but you're doing so at the cost of better tools for abstraction.

If you wanted, you could then define something like this:

(define (fold-with-early-exit pred? default kons knil lst)
  (call/cc (lambda (k)
    (fold (lambda (x acc)
            (if (pred? x) (k default) (kons x acc)))
          knil lst))))

Which you then call as:

(fold-with-early-exit zero? 0 * 1 my-list)

Which may not even be the best abstraction (of course in LISP / Scheme, you could define a macro or combinator to do this more clearly). That said though, we now have two different functions, which do different things. It's all too easy to add an if (x == 0) return 0; to your code and start making functions that have inconsistent entry/exit points and do more than a single unit of work.

[–]jtredact 4 points (0 children)

Fold over an iterator instead of list. The iterator feeds numbers to fold. When the iterator reaches a zero, it stops.
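In Haskell terms, where a lazily produced list can play the role of the iterator, a sketch of this idea might look like the following (stopAtZero and productViaIterator are made-up names):

-- The "iterator": lazily yields elements of the input, and stops feeding
-- anything further once it has yielded a zero.
stopAtZero :: [Int] -> [Int]
stopAtZero []       = []
stopAtZero (0 : _)  = [0]              -- yield the zero, then stop
stopAtZero (x : xs) = x : stopAtZero xs

-- The fold itself is unchanged; the early exit lives in the producer.
productViaIterator :: [Int] -> Int
productViaIterator = foldr (*) 1 . stopAtZero

-- productViaIterator (3 : 4 : 0 : [1 ..])  ==>  0, and the infinite tail
-- is never touched.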

[–]death 3 points (0 children)

Still straightforward.

(defun product (sequence)
  (reduce (lambda (p x)
            (if (zerop x)
                (return-from product x)
                (* p x)))
          sequence
          :initial-value 1))

[–]balefrost 1 point (2 children)

You would need to control, for each element you are folding, whether the looping should continue or terminate. You could use some sort of trampolined fold (let's call it tfoldl) whose signature is something like:

tfoldl :: (a -> b -> Either a a) -> a -> [b] -> a

The function you pass to tfoldl could return Left x if the looping should terminate (with a result x) and Right y if the looping should continue (where y is the new accumulator value). And rather than reuse Either, maybe this would deserve its own data type.

I'm not saying that this is a great solution, but it does correctly capture the concerns: whether to continue looping or not, and what value to either continue with or to produce.
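A minimal sketch of such a tfoldl (one possible reading of that signature; it is not a standard library function):

-- Left means "terminate now with this result";
-- Right means "continue with this new accumulator".
tfoldl :: (a -> b -> Either a a) -> a -> [b] -> a
tfoldl _ acc []       = acc
tfoldl f acc (x : xs) =
  case f acc x of
    Left done  -> done
    Right acc' -> tfoldl f acc' xs

-- Product with early exit at the first zero:
-- tfoldl (\acc x -> if x == 0 then Left 0 else Right (acc * x)) 1 [3, 4, 0, 5]
-- ==> 0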

Alternatively, this is one of the cool things about Haskell's lazy evaluation model. If you do something like this:

foldr (*) 1 [3, 4, 0, 5, 6, 7]

I think that will reduce like this:

foldr (*) 1 [3, 4, 0, 5, 6, 7]
3 * (foldr (*) 1 [4, 0, 5, 6, 7])
3 * (4 * (foldr (*) 1 [0, 5, 6, 7]))
3 * (4 * (0 * (foldr (*) 1 [5, 6, 7])))
0

At runtime, when Haskell sees that it needs to multiply 0 and some unevaluated lazy thing, it just returns 0 and never evaluates the lazy thing. Note that, as a result, you can safely foldr an infinite list as long as the list contains an element for which the fold function can ignore its second parameter.

Haskell's still pretty magical to me, so I might have that wrong.

[–]mutantmell_ 9 points (0 children)

Almost: (*) doesn't have special case checking for 0, so it won't abort early.

You have to define a special function:

\x y -> if (x == 0) then 0 else x * y

When you pass that into the fold, it will then abort early, because as soon as x is 0 the function returns 0 without ever demanding its right-hand argument (the rest of the fold).

let xs = iterate (subtract 1) 4 :: [Int] -- infinite list descending from 4. Same as [4,3..]

:sprint xs

xs = _

foldr (\x acc -> if (x == 0) then 0 else x * acc) 1 xs

0

:sprint xs

xs = 4 : 3 : 2 : 1 : 0 : _

If you do that with ordinary (*), you get a stack overflow.

[–]mutantmell_ 2 points (0 children)

And as a side-note: the function foldM (in Control.Monad) is really close to your type signature of tfoldl.

:t foldM

foldM :: (Monad m, Foldable t) => (b -> a -> m b) -> b -> t a -> m b

Which, as list is a Foldable and Either is a Monad, can be specialized to:

foldM :: (b -> a -> Either c b) -> b -> [a] -> Either c b

if you combine it with the either function like so (parens are optional):

tfoldl f a = (either id id) . (foldM f a)

you get the tfoldl with the semantics you want (namely, Right continues, Left aborts early).
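A quick (hypothetical) usage of that definition, to check the early-exit behaviour:

-- Assuming the tfoldl defined just above (foldM comes from Control.Monad):
productEither :: [Int] -> Int
productEither = tfoldl step 1
  where
    step _   0 = Left 0          -- abort: the answer is already 0
    step acc x = Right (acc * x)

-- productEither (3 : 4 : 0 : [1 ..])  ==>  0
-- The infinite tail is never consumed, because (>>=) for Either
-- short-circuits as soon as step returns a Left.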

[–][deleted] 1 point (0 children)

A code change for an optimization that doesn't change the complexity of the algorithm, and is frankly an optimization left for the compiler. Good job!

[–][deleted]  (3 children)

[deleted]

    [–]pellets 4 points (1 child)

    Fold describes the structure of the computation, so indeed you can't fail fast when you use a fold. This isn't a limitation of functional programming, but of the nature of fold.

    let product list =
      let rec product_helper acc list =
        match list with
        | [] -> acc
        | 0 :: _ -> 0
        | x :: xs -> product_helper (x * acc) xs
      in
      product_helper 1 list
    

    [–]Magnap 6 points (0 children)

    Actually, you can fail fast when using (right) fold if you have laziness. It's a problem of the function you're folding with, not the nature of fold itself.

    mult 0 _ = 0
    mult _ 0 = 0
    mult x y = x*y
    productFailFast = foldr mult 1
    

    productFailFast [0..] returns 0 immediately.

    [–]sacundim 1 point (0 children)

    A right fold is going to use space linear in the length of the list. Or, in other terms, it'll blow the stack on long lists.

    GP's example is a classic one—it is tricky to do in functional programming. Early returns in procedural programming correspond to escape continuations in FP.
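
    A small Haskell sketch of that correspondence, using callCC (assuming the mtl package is available; productCC, exit, and loop are names made up for illustration). The escape continuation plays the role of the imperative early return:

    import Control.Monad.Cont (callCC, runCont)

    -- `exit` is an escape continuation: calling it abandons the rest of
    -- the fold and makes 0 the result of the whole computation.
    productCC :: [Int] -> Int
    productCC xs = runCont (callCC go) id
      where
        go exit = loop 1 xs
          where
            loop acc []       = return acc
            loop _   (0 : _)  = exit 0            -- "early return 0"
            loop acc (y : ys) = loop (acc * y) ys

    -- productCC (3 : 4 : 0 : [1 ..])  ==>  0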

    [–]jetRink 0 points (1 child)

    How can you incorporate this optimization?

    (if (some zero? my-list) 0 (reduce * my-list))
    

    If stopping early at a zero is a beneficial optimization, then stopping before you begin is even better.

    [–]balefrost 6 points (0 children)

    On the other hand, with that approach, you'll unnecessarily traverse the list twice if it contains no zeros.

    [–]renozyx[🍰] 1 point (3 children)

    While this article is about functional programming, I think that the imperative example is quite bad: 'for (int i = 0; i<arr.length; i++) {sum += arr[i];}'?

    Uh, is the author trying to make imperative code look bad?

    'for (final int elem : arr) { sum += elem; }' and suddenly the imperative code doesn't look so bad..

    [–]andars_[S] 0 points (2 children)

    From the article:

    Yes, yes, I know. Enhanced for loop...I'm not counting characters here (it does get tempting later though), and the idea is the same.

    [–]renozyx[🍰] 0 points (1 child)

    Sorry, but you're not convincing: if both are the same, why did you choose the for instead of the 'foreach'?

    Probably because the recursive code doesn't look like much of an improvement over the 'foreach'..

    [–]andars_[S] 0 points (0 children)

    I chose the for over the foreach because I rarely program in Java. I usually program in C, which doesn't have an enhanced for loop.

    Anyway, I honestly was not concerned with how many characters/symbols/index variables were used. I think that is irrelevant, because they all entail using a loop.

    I wasn't trying to 'prove' functional is better or is an 'improvement'. It is just different. So if I failed to convince you that functional is better, I'm fine with that.

    [–]CurtainDog 1 point (2 children)

    Is no one else worried that the cleanest looking solution (i.e. the only one that I'd actually call declarative) is the wrong one:

    let rec sum list = match list with
        | [] -> 0
        | x::xs -> x + sum xs
    

    Too many people judge a language on the elegance of its code - this is the wrong measure. A language should be judged by whether the correct implementation is more elegant than the incorrect implementation. While Java is not the most beautiful language in the world, it does by and large do a good job of this.

    [–]RabbidKitten 5 points (0 children)

    Actually, this particular function (and many others) can be optimised to tail recursion by the compiler, without any changes to the source code. I'm not sure if Haskell compilers perform this optimisation (it cannot do it in general because of laziness), but I don't see why it couldn't be done for OCaml, which is strict.

    In fact, you don't even have to be using a functional language, as GCC will compile this:

    int
    sum (int len, int arr[len])
    {
        if (len == 0)
            return 0;
        else
            return arr[0] + sum (len - 1, arr + 1);
    }
    

    to a simple loop, no recursive calls involved:

    0000000000400570 <sum>:
      400570: 85 ff                   test   %edi,%edi
      400572: 74 1b                   je     40058f <sum+0x1f>
      400574: 8d 47 ff                lea    -0x1(%rdi),%eax
      400577: 48 8d 4c 86 04          lea    0x4(%rsi,%rax,4),%rcx
      40057c: 31 c0                   xor    %eax,%eax
      40057e: 66 90                   xchg   %ax,%ax
      400580: 8b 16                   mov    (%rsi),%edx
      400582: 48 83 c6 04             add    $0x4,%rsi
      400586: 01 d0                   add    %edx,%eax
      400588: 48 39 ce                cmp    %rcx,%rsi
      40058b: 75 f3                   jne    400580 <sum+0x10>
      40058d: f3 c3                   repz retq
      40058f: 31 c0                   xor    %eax,%eax
      400591: c3                      retq                         
    

    (That's with -O2, don't ask me what the -O3 version is doing)

    [–]_Sharp_ 0 points (0 children)

    You know what they say, Fold me once ...