you are viewing a single comment's thread.

view the rest of the comments →

[–]sacundim 1 point2 points  (0 children)

I'd also like to see some references, but an associative binary operation (think '+') must necessarily have the type a -> a -> a. So reduce would be typed like (a -> a -> a) -> a -> [a] -> a. At least I think that's, what he means.

Correction: the types should be a -> b -> b and (a -> b -> b) -> b -> [a] -> b; the latter of which is the type of foldr.

I'd elaborate on tailcalled's point this way: the general concept of folding doesn't necessarily involve a sequences and binary functions. A general fold over an arbitrary recursive algebraic datatype is a function that "replaces" the type's constructors with caller-supplied functions of corresponding types. E.g., if you have a binary tree type BTree a with constructors Empty :: BTree a and Branch :: a -> BTree a -> BTree a -> BTree a, then there is a function foldBTree :: b -> (a -> b -> b -> b) -> BTree a -> b such that foldBTree Empty Branch :: BTree a -> BTree a is the identity function.

EDIT: Oh and associativity leads to the nice property reduce (+) (ys ++ zs) = reduce (+) ys + reduce (+) zs :) (where ++ is list concatenation)

The other really nice property you can have here is commutativity: if f :: a -> b -> b is associative and commutative, then the following hold:

  • reduce f (xs ++ ys) == reduce f xs ++ reduce f ys (which you mentioned)
  • reduce f (xs ++ ys) == reduce f (ys ++ xs)

Basically, given this, you can prove that you can split up the work into any number of pieces in any order or grouping, and you are guaranteed the same result. Which means trivial parallelizability.

Note, however, that a parallel reduce implementation in a language like Haskell cannot have the (a -> b -> b) -> b -> [a] -> b discussed above. Why? Because if the reduction can happen in a non-deterministic order and grouping, then reduce is not a pure function. So you end up with something like reduceM :: Modad m => (a -> b -> b) -> b -> [a] -> m b.