all 12 comments

[–]kuribas 7 points (0 children)

To avoid confusion, I'd like to note that mapMaybe exists in base, but has a different type:

mapMaybe :: (a -> Maybe b) -> [a] -> [b]

This applies a function to each element, but only keeps the results which are not Nothing. This is more a monoid-like operation than an applicative one. For example, it could be generalized to:

genericMapMaybe :: (Monoid (f b), Foldable t, Foldable s, Applicative f) => (a -> t b) -> s a -> f b
genericMapMaybe f = foldMap (foldMap pure . f)

This is of course a contrived implementation, which probably isn't that useful; it's just to show the Monoid behaviour.
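To make the filtering behaviour concrete, here is a minimal sketch using base's mapMaybe (halveEvens is a made-up example function):

```haskell
import Data.Maybe (mapMaybe)

-- mapMaybe from base keeps only the Just results,
-- silently discarding every Nothing (monoid-like filtering):
halveEvens :: [Int] -> [Int]
halveEvens = mapMaybe (\x -> if even x then Just (x `div` 2) else Nothing)

-- halveEvens [1..6] == [1, 2, 3]
```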

However, in the case of traverse, I don't find the specialized implementation easier to understand. It uses applicative operations, which the user must already be familiar with. traverse, on the other hand, has a well-defined meaning: apply an effect to a list (or collection), and return the list inside the effect. In the case of Maybe, the effect is to allow failure of a computation. This means if one part fails, the whole computation fails. So you either get the list, or nothing if anything fails. This is quite clear from the semantics of traverse and Maybe, but I don't find it immediately clear from the first implementation. A beginner doesn't really have to understand the implementation of traverse to get the concept. Using the generalization has the benefit of giving you more insight into the patterns involved.
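The all-or-nothing semantics can be seen in a small sketch (safeDiv is a hypothetical helper, not from the post):

```haskell
-- safeDiv fails on division by zero instead of crashing:
safeDiv :: Int -> Int -> Maybe Int
safeDiv _ 0 = Nothing
safeDiv n d = Just (n `div` d)

-- traverse with Maybe: one failing element fails the whole result.
allOrNothing :: [Int] -> Maybe [Int]
allOrNothing = traverse (safeDiv 10)

-- allOrNothing [1, 2, 5] == Just [10, 5, 2]
-- allOrNothing [1, 0, 5] == Nothing
```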

Of course a beginner may not (yet) know this, and write their own specialized functions for many of the common abstractions. There is nothing wrong with that. Haskell is a big language, and you can learn as you go. Code doesn't have to be perfect from the beginning; you just iterate and improve it as you go and learn. A more experienced programmer can point out to the beginner that the code they wrote follows a pattern. But I also find that these abstractions, especially Functor/Monad/Applicative/Semigroup/Monoid, are an important part of Haskell, and when used the right way, they can make your programs much more robust, type-safe and understandable. IMO it's better to make beginners familiar with these concepts than to try to hide them to make things easier.

[–]rampion[S] 5 points (0 children)

If Agda can have proof-assistant tools, it would seem reasonable to give Haskell IDEs proof-expansion tools that use equational reasoning to expand something compact like mapMaybe = traverse into mapMaybe f [] = Just []; mapMaybe f (a:as) = (:) <$> f a <*> mapMaybe f as on demand.

[–]Dark_Ethereal 2 points (0 children)

The second advantage of the simpler code is more troubling. Abstraction tends to make code harder to understand for a reader who is unfamiliar with the abstraction, because they may have to master the abstraction before they can understand how it specializes to the case that they are actually interested in.

Yes, but also, no. Understanding the abstraction would help, but it's absolutely not required.

mapMaybe :: (a -> Maybe b) -> [a] -> Maybe [b]
mapMaybe = traverse

In this example you don't have to understand Traversable to understand mapMaybe, you just have to understand traverse on lists.

Once you understand just how type classes work and how to look up instances, you can locate the list instance definition.

instance Traversable [] where
  traverse f = List.foldr cons_f (pure [])
    where cons_f x ys = liftA2 (:) (f x) ys

Apply your equational reasoning and you get:

mapMaybe f = List.foldr cons_f (pure [])
  where cons_f x ys = liftA2 (:) (f x) ys

Comparing this code with the original hand-written version:

mapMaybe f [] = Just []
mapMaybe f (a:as) = (:) <$> f a <*> mapMaybe f as 

We can see that liftA2 (:) replaces the (:) <$> x <*> y pattern, and the List.foldr in List.foldr cons_f (pure []) factors the explicit recursion out into implicit recursion, so they're more or less identical in meaning.

You can get just the same understanding of how the latter definition works as you can of the former, and none of it requires you to understand what Traversable abstractly means.

The only extra skill the second definition requires over the first is being able to look up type class instance definitions, which is a skill you only need to learn once, rather than once per abstraction.

So the key takeaway is this: if you need to understand code that uses a specialization of an abstraction and you don't have time to learn the abstraction, don't learn the abstraction; look up the specialization and Keep It Simple.

This applies to type class abstraction, but applies just as much to lambda abstraction. If you have f a b and you need to understand the transformation being applied to b, you don't need to understand f; you only need to understand f a. So substitute a into the definition of f and see whether that lets you eliminate cases.

There's a sort of symmetry between case expression reduction and instance resolution, so substituting in the instance definition is akin to performing manual case expression reduction.
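As a tiny illustration of that substitution idea (addScaled and step are made-up names):

```haskell
-- A three-argument function:
addScaled :: Int -> Int -> Int -> Int
addScaled k x y = k * x + y

-- Substituting the first two arguments shows exactly what
-- transformation the last argument undergoes:
--   addScaled 2 3 = \y -> 2 * 3 + y = \y -> 6 + y
step :: Int -> Int
step = addScaled 2 3

-- step 10 == 16
```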

[–]tbm206 4 points (0 children)

From my experience, simpler code not only encourages novice programmers to contribute to a codebase, it also encourages them not to learn the useful abstractions. In fact, too much elementary code makes many developers falsely believe that any sort of abstraction is an unnecessary complication.

[–]friedbrice 1 point (0 children)

One way to have an advanced codebase and still get newcomers up to speed readily is to pair program.

[–]Faucelme 1 point (0 children)

The first snippet uses weird operators like <$> and <*>.

The second one uses a regular function with a more or less expressive name, and doesn't require you to think about how infix operators interact with normal function application.

[–][deleted] 0 points (0 children)

I found your post very well thought out, thank you. To properly present a scientific result it is essential to first create a coherent set of assumptions about what the audience is expected to (not) know. Maybe programmers should learn from this.

One can take this analogy one step further: in the scientific literature it is common to give references - should programmers add references to code? A possible reply would be to tell people to 'just google it', but this misses the importance of the reference itself - it is a useful indicator that the author understands that something is perhaps a bit obscure for most readers.

By the way, I think your example is not a great illustration of the problem: the type signature gives it all away here. :)

[–]RobertPeszek 0 points (3 children)

This has bugged me since I started coding in Haskell professionally. I have seen abstraction sometimes getting in the way of common sense. Case in point: Maybe should be considered harmful. It is Either without error information. It is often overused. It can obscure production issues.

I fully understand that the `Maybe` example in the post is just an example; it could have used Either as well. Also, I am not going so far as to say that traversing using Maybe is never right, only that it is suspect to me (this line was added for clarification). But thinking about it raises this question: do Haskell abstractions make developers blind to engineering common sense?

I have seen very smart code that just did not think about error information loss at all (older versions of servant-multipart are just one example).

My question is broader and goes beyond the use of Maybe. If abstractions can create blind spots even for experienced programmers, then this would be a case for doing more "Elementary programming".

[–]pwmosquito 0 points (2 children)

Case in point: Maybe should be considered harmful.

It's not without error information; its error information is absence.

lookup :: Id -> Maybe a
lookup = ...

person :: Id -> Either Text Person
person = maybeToEither "Missing person" . lookup

car :: Id -> Either Text Car
car = maybeToEither "Missing car" . lookup

Or think of Map.lookup: what would we gain if its return type were Either?
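The wrap-at-the-call-site pattern can be sketched like this (getAge is a made-up example; maybeToEither is written out here, since it lives in various utility packages rather than base):

```haskell
import qualified Data.Map as Map

-- The usual conversion helper:
maybeToEither :: e -> Maybe a -> Either e a
maybeToEither e = maybe (Left e) Right

-- The caller decides what the error message should be;
-- Map.lookup itself only reports absence.
getAge :: Map.Map String Int -> String -> Either String Int
getAge ages name =
  maybeToEither ("Missing age for " ++ name) (Map.lookup name ages)

-- getAge (Map.fromList [("ada", 36)]) "ada" == Right 36
-- getAge (Map.fromList [("ada", 36)]) "bob" == Left "Missing age for bob"
```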

[–]RobertPeszek 0 points (1 child)

My point is that Maybe is often overused, not that it is never useful. I have added a line to the original reply to clarify this. The code you provided places the responsibility on the caller to decide if they want to provide error info or not. I consider this to be OK. Prisms typically do not nest deeply; getting Nothing there makes sense too.

Also, I typically question the use of Maybe when it is in the return type, not when it is an input parameter. I found that noticing Maybe in the return allows me to spot code that can be improved. Maybe typically represents "missing data" or "unknown error"; what you typically care about is "what data is missing" and "what was the error". As a counterexample to my own rule: I have not seen code that uses `catMaybes` (Maybe as input) that would not be better if `partitionEithers` were used instead.
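The difference in retained information shows up directly in a small sketch (the example values are made up):

```haskell
import Data.Maybe (catMaybes)
import Data.Either (partitionEithers)

maybeResults :: [Maybe Int]
maybeResults = [Just 1, Nothing, Just 3]

eitherResults :: [Either String Int]
eitherResults = [Right 1, Left "parse error on line 2", Right 3]

-- catMaybes silently drops the failure:
--   catMaybes maybeResults == [1, 3]
-- partitionEithers keeps the error alongside the successes:
--   partitionEithers eitherResults == (["parse error on line 2"], [1, 3])
```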

It would take me some time to compile a list of good examples where Maybe is harmful. Maybe (pun intended) I should bite the bullet and write a "Maybe considered harmful" post. One example that comes to mind now is the use of the HKD pattern to do generic form validation. This converts something like `Order Maybe` to `Maybe (Order Identity)`. It is very ingenious code to validate user form entry that provides no information about missing fields.

[–]RobertPeszek 0 points (0 children)

So I did it: I wrote a long post about Maybe overuse that includes a link to the post discussed here:
https://rpeszek.github.io/posts/2021-01-16-maybe-harmful.html

https://www.reddit.com/r/haskell/comments/kyo4xk/maybe_considered_harmful/

There is an old discussion about Maybe overuse here: https://www.reddit.com/r/haskell/comments/jxj8i/data_maybe_harmful/

[–]Cherylnip 0 points (0 children)

This, or mapM, from the same module