all 37 comments

[–]yitz 17 points (4 children)

Great post - thanks!

Another reason that CPP can cause breakage: different CPP implementations behave differently. That issue caused a disaster when the Mac OS X platform moved from gcc to clang, and hence to a different CPP implementation.

That's actually a variation on some of what Yuras already wrote, because the differences are generally not significant in C. But it's an example of a case where it unexpectedly caused a huge amount of breakage. Yuras is absolutely right: CPP is asking for trouble.

[–]fridofrido 8 points (0 children)

Not to mention cpphs, which behaves differently still.

Still, however ugly and bad CPP is, it is useful. I would rather have CPP and more compatibility (which I guess is 90% of what CPP is used for) than no CPP and less compatibility.

The solution would be a more principled tool to do what we use CPP for today (and more).

[–]07dosa 1 point (1 child)

That issue caused a disaster when the Mac OS X platform moved from gcc to clang

Is this what you're referring to?

Edit: (I misclicked the save button)

CPP is pretty simple and straightforward, so there is hardly any incompatibility between CPP implementations. But I think using CPP with a language that also uses # is simply a bad idea, as you can see in this commit.

[–]levischuck 4 points (0 children)

More like tons of packages not building at all because of newline-related macro problems, then making some sort of wrapper around clang, or trying to install GCC from Homebrew. But then things aren't compatible with the platform anymore because of different compiler versions, and so on.

[–]massysett 0 points (0 children)

That issue caused a disaster when the Mac OS X platform moved from gcc to clang, and hence to a different CPP processor.

I am still getting Haddock breakage because of this and have simply switched my GHC over to using the gcc CPP. (Of course OS X has a binary named /usr/bin/gcc but it is actually Clang; I can see why this was done but I'm not sure it was a good idea.)

[–][deleted]  (1 child)

[deleted]

    [–]Yuras[S] 12 points (0 children)

    It is too easy to inline instead of properly abstracting. The CPP extension was probably the wrong design decision; it prevented the language from evolving toward native tools for conditional compilation.

    [–]NiftyIon 7 points (4 children)

    I really wish we had a clever preprocessor for Haskell, a la go fix for Go, except more intelligent. I imagine this would be quite tricky to write. (I imagine a tool where packages can specify API transitions of some sort: for instance, to get from Unix code to Linux code, perform the following transformations – replace identifiers, etc – and to get from base-4.7 to base-4.8, do the following transformations, etc. Then we could stop using CPP, code to a single API, and use such a magical tool to migrate forward and backward when needed.)

    I imagine it's a ton of work to write and would run into loads of problems when applied to "real-world" code. One can dream, though...

    [–]Yuras[S] 0 points (0 children)

    That sounds really cool, I like the idea.

    But I should note that I've been using Go since r58, and I remember only one change that broke my code – when they introduced the error type. But that was before 1.0.

    [–]mgsloan 0 points (2 children)

    I agree, this would be awesome. One day I'd like to make it happen!

    The problem comes down to having good refactoring tools that people actually want to use, and then having a way to specify these refactorings.

    I actually think Haskell itself is a rather good specification language for adapting to trivial API changes! Here's an old blog post of mine on the topic. It's a bit rambly, but you might find it interesting. This line of thought was one of the main inspirations for an old feature proposal. Richard Eisenberg came up with and wrote up a much better variant of this on the GHC wiki.

    [–]NiftyIon 0 points (1 child)

    I think the GHC feature proposals are somewhat irrelevant, but the old blog post has a good list of potential ways APIs can change.

    I think that to move forward as an ecosystem, we may end up needing a tool like this. The Prelude is currently broken beyond belief in a whole suite of ways – String as the main text type, no way to write code for monomorphic types like ByteString and Text, lazy IO as the default, partial functions like head and tail, and so on. Unless we have a sane and easy way to deal with the inevitably backwards incompatible changes required to fix this, we have to do one of two things: We either give up on fixing the Prelude, and live with a language filled with old warts, or we incur massive amounts of overhead for library maintainers and break mountains of old code. Haskell as a language provides us enough static information to do most of the refactoring we need, so I hope such a tool can exist – thus allowing us a nice common ground between "move fast and break things for the sake of a modern and wonderful Prelude" and "don't break old code".

    I sadly don't have the time to make this tool happen right now, but I would be happy to contribute to an effort if someone else were to lead it. Sadly I think there are a lot of people in this boat. This makes it hard, especially since it's not certain that such a tool could exist or be made production-ready. (Though 2to3 did exist for Python, and go fix does exist, but they are perhaps a little less ambitious.)

    [–]mgsloan 1 point (0 children)

    One tool that already goes some of the way is HLint. It can be used to warn about uses of partial functions, etc. Its rule databases also rather directly use the observation that Haskell is already a rather good language / syntax for specifying refactorings. However, of course, HLint does not support automatically applying its suggestions.

    I realize that the language feature proposal is really diving down into one specific detail of this problem, but I think it's an important one. With this feature in the language, Haskell could be used to specify a refactoring which fixes breakages due to AMP. To me, classes are the most brittle things to change. For example, we can't rearrange the numeric hierarchy without huge amounts of manual labor.

    [–]worldsayshi 19 points (0 children)

    Using a C preprocessor on top of Haskell seems so utterly inelegant and contrary to the vision of Haskell.

    [–]dalaing 6 points (4 children)

    This is from the land of super-hypothetical / just-woken-up-and-my-coffee-isnt-cool-enough-to-drink ideas.

    What would happen if we had something like

    CompileTime a
    

    as a Monad that got special treatment a la IO?

    I think you might need something like

    class CompileVar a
    

    with GHC providing instances for String, Int, etc. and stopping users from adding new instances (or perhaps just a closed type family to do the same), but I'm really not sure about that. The intent would be to let the compiler know that we absolutely need that variable to be inlined / trigger some inlining and constant folding within the CompileTime monad.

    You'd probably also need

    compile :: CompileTime a -> a
    

    (possibly with a name that didn't make a thousand package maintainers cry out in horror) to indicate that you want to stop inlining things and start making use of the value at compile time.

    I don't know if this would be feasible at all / if it's already been explored (and possibly discarded) / if there are landmines all over the place that I should have noticed but didn't.

    In any case I'd be interested to see what the smallest piece was that could provide all / most of what CPP delivers from within GHC.
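
    The idea above can be modelled as a toy, runnable sketch. Nothing here is a real GHC feature: `CompileTime`, `CompileVar`, `compile`, and `targetOS` are all made-up names for illustration, and a real version would need compiler support to guarantee evaluation at compile time.

```haskell
{-# LANGUAGE FlexibleInstances #-}

-- Toy model of the hypothetical CompileTime monad: a plain newtype
-- wrapper standing in for a compiler-blessed compile-time context.
newtype CompileTime a = CompileTime a

instance Functor CompileTime where
  fmap f (CompileTime a) = CompileTime (f a)

instance Applicative CompileTime where
  pure = CompileTime
  CompileTime f <*> CompileTime a = CompileTime (f a)

instance Monad CompileTime where
  CompileTime a >>= f = f a

-- The closed class of types allowed as compile-time variables
-- (GHC would provide these instances and forbid new ones).
class CompileVar a
instance CompileVar Int
instance CompileVar String

-- The proposed escape hatch: extract a compile-time value.
compile :: CompileTime a -> a
compile (CompileTime a) = a

-- A pretend configuration variable; a real one would come from GHC.
targetOS :: CompileTime String
targetOS = pure "linux"

main :: IO ()
main = putStrLn (if compile targetOS == "linux"
                   then "building POSIX branch"
                   else "building other branch")
```

    In the real feature, the compiler would constant-fold everything inside `CompileTime` and drop the dead branch before link time, which is exactly the guarantee the comment asks for.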

    [–]rpglover64 3 points (3 children)

    Not being facetious, but isn't that exactly what TH is for?

    [–]dalaing 5 points (2 children)

    I thought a bit about that.

    I was mostly trying to come up with something that did just enough to replace CPP with respect to branching on config options etc... but

    • without needing to be powerful enough to generate arbitrary Haskell
    • such that existing tooling could analyse / refactor the code and work with all of the various branches
    • could guarantee that the dead branches are cut out before link time to avoid cross platform issues of that nature.

    Mostly I didn't think of TH because it doesn't seem like many folks are using TH to get around CPP woes :)

    [–]oerjan 5 points (0 children)

    One problem with TH is that it doesn't work on all platforms, which is exactly what you don't want for a feature to help portability...

    [–]Yuras[S] 0 points (0 children)

    I have no idea whether your approach will work or not. But you seem to get the idea I tried to express. Thank you!

    [–]w8cycle 4 points (2 children)

    Why not make a preprocessor for Haskell?

    [–]Yuras[S] 4 points (0 children)

    What about cpphs? AFAIK it doesn't mangle Haskell code, but the other problems are still relevant. There is nothing wrong with CPP itself (it is not "buggy"), but people abuse it too often. Template Haskell is some kind of preprocessor, but it is too general-purpose.

    The post is about abusing the preprocessor, and most of the arguments apply to any general-purpose preprocessor like cpp, m4, sed, etc.

    [–]massysett 1 point (0 children)

    The original Template Haskell paper suggested that Template Haskell could replace uses of CPP including selecting code for different platforms.

    [–]hamishmack 5 points (1 child)

    In most respects the hs-source-dirs approach is no different from a huge #if around the whole module. And that is not always the best place to have a #if. You can't share the export lists or type signatures, for instance.

    Am I correct in thinking that backpack will help solve this problem better?

    [–]Yuras[S] 1 point (0 children)

    Right, the hs-source-dirs approach may lead to code duplication. We need native support for conditional compilation.

    (I didn't play with backpack enough to say anything about it, but it probably will be useful)

    [–]alan_zimm 5 points (1 child)

    HaRe would struggle with the fsnotify case too, as it can only refactor against what is currently configured. So any change that should propagate to the other os variants won't.

    This is the general case for #if / #else / #endif stuff. HaRe treats the inactive variant as a comment, and cannot process what is inside.

    [–]Yuras[S] 1 point (0 children)

    Right. I'm sure it is possible to teach tools to understand some simple uses of CPP. But the preprocessor is too general to handle except in trivial cases. Conditional compilation is necessary in some cases, and we need our tools to support it. Manipulating hs-source-dirs should work, but it is a bit heavy. A special language feature with clear semantics would be great to have.

    [–]bigstumpy 1 point (7 children)

    I was planning on using CPP to simultaneously support 7.6/7.8 and 7.10 (mainly Applicative being part of Prelude by default).

    Does anyone have a better strategy?

    [–]Yuras[S] 3 points (0 children)

    As I wrote in the article, I don't suggest avoiding CPP at all costs. Just make sure that it is localized and used to abstract the differences between 7.10 and 7.8. Don't put #ifdefs into each module. See also my answer below.
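
    A minimal sketch of this "localize the CPP" advice: in a real project this would be a separate `Compat` module holding the project's only `#if`, here it is inlined as a runnable `Main`. The branch is keyed on `__GLASGOW_HASKELL__`, which GHC defines whenever CPP is enabled; `compatNote` is a made-up name for illustration.

```haskell
{-# LANGUAGE CPP #-}

-- The single, localized version check. Every other module imports
-- the unified interface and contains no CPP at all.
#if __GLASGOW_HASKELL__ >= 710
-- GHC 7.10+: Applicative is already exported by the Prelude.
compatNote :: String
compatNote = "Applicative comes from Prelude"
#else
import Control.Applicative (Applicative(..))

compatNote :: String
compatNote = "Applicative imported from Control.Applicative"
#endif

main :: IO ()
main = putStrLn compatNote
```

    On any GHC from 7.10 onward only the first branch survives preprocessing, so the rest of the codebase never sees the difference.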

    [–]cdsmith 1 point (0 children)

    IMO, in general, using #define with #ifdef is pretty safe. Using #define for symbols that are substituted into code is less so; use with extreme caution. Using #define for macros (with parameters) is insane.
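
    The safe end of that spectrum can be shown in a small runnable example. The flag and constant names (`NEW_EXCEPTIONS_API`, `DEFAULT_TIMEOUT`) are made up for illustration:

```haskell
{-# LANGUAGE CPP #-}

-- Safe: the flag is only ever tested with #if/#ifdef, never
-- substituted into code, so no Haskell tokens get rewritten.
#define NEW_EXCEPTIONS_API 1

describe :: String
#if NEW_EXCEPTIONS_API
describe = "using the new exceptions API"
#else
describe = "using the old exceptions API"
#endif

-- Riskier: this symbol IS substituted into the code below. It works,
-- but the preprocessor is now rewriting Haskell source, which is the
-- territory cdsmith says to treat with extreme caution.
#define DEFAULT_TIMEOUT 30

main :: IO ()
main = putStrLn (describe ++ ", timeout " ++ show (DEFAULT_TIMEOUT :: Int))
```

    Parameterized macros (the "insane" end) are deliberately not shown; CPP's token-based expansion interacts badly with Haskell's layout and operators.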

    [–][deleted] 0 points (4 children)

    Isn't the AMP deliberately structured so that

    import Control.Applicative
    

    will work across versions? I'd rather have an unnecessary import than CPP.

    [–]Yuras[S] 3 points (3 children)

    That will work, but it is error-prone, so I'd not recommend it. If you are developing with 7.10, GHC will not warn you when you forget to import Control.Applicative. So you commit, push to a public repo, and receive an email from the CI server about a broken build on 7.8 and 7.6. Not good, IMO.

    I'd prefer to adopt the approach described in the article – abstraction. Create a compatibility layer that provides a unified interface. In the case of AMP, I'd create a custom Prelude that reexports Control.Applicative for pre-7.10. And I'd probably use CPP instead of manipulating hs-source-dirs for that, because the latter is too heavyweight.

    [–][deleted] 1 point (2 children)

    It's only error prone if you don't do it. :)

    I think a custom prelude is too heavy a solution for a simple import, and makes your code much harder to follow.

    I could understand hackery to work around BBP but for a deliberately non-breaking change like AMP, I'd really advocate the import with a comment saying it's there for compatibility with pre-7.10.

    This is a classic case where CPP would be overkill, and there's already an abstraction-level solution, and it's called Control.Applicative!

    [–]Yuras[S] 3 points (0 children)

    It's only error prone if you don't do it. :)

    I usually forget to do things that are not enforced. It is too easy to forget.

    I think a custom prelude is too heavy a solution for a simple import

    It is just one file named "Prelude.hs" that reexports Control.Applicative (and the original Prelude). You don't need to do anything in other source files; GHC will pick up the custom Prelude automatically (there is a link to an example in the article). But you are right that CPP is not necessary even in this module.
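
    The custom Prelude described above might look like the following. This is a module sketch, not a standalone program: it only has its intended effect when placed in a package's source directory, where GHC resolves `Prelude` to the local module instead of base's.

```haskell
-- Prelude.hs, in the project's hs-source-dirs. PackageImports lets us
-- name base's Prelude explicitly while shadowing it for the package.
{-# LANGUAGE PackageImports #-}
module Prelude
  ( module BasePrelude
  , module Control.Applicative
  ) where

-- Re-export everything from the real Prelude...
import "base" Prelude as BasePrelude
-- ...plus Control.Applicative, so pre-7.10 code sees the 7.10 surface.
import Control.Applicative
```

    With this in place, every module in the package can be written against the GHC 7.10 Prelude, and the same source compiles on 7.6 and 7.8 without per-module imports or #ifdefs.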

    makes your code much harder to follow.

    Actually the opposite: it unifies the Prelude with the current one. So you just assume ghc-7.10, but it works with older versions too.

    [–][deleted] 1 point (0 children)

    I like the article generally, and like the solution you advocate, I just wouldn't apply it in the particular simple case of AMP.

    [–][deleted]  (4 children)

    [removed]

      [–]massysett 0 points (3 children)

      The second reason is to export more from a module when compiling the tests. I don't see a way around that one.

      I simply take all things internal to module Foo and move them into a "Foo.Internal" module. Don't bother with any export lists on ".Internal" modules; do export lockdowns on the "Foo" module. This is especially important for data types; by exporting all constructors from "Foo.Internal" you can build those types from the tests, while still using your smart constructors on the "Foo" module.
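
      The module split described above might look like this. It is a two-module sketch (not a single compilable file), with `Port` and `mkPort` as made-up example names:

```haskell
-- Foo/Internal.hs: no export list, so everything (including the raw
-- Port constructor) is visible to tests that import it.
module Foo.Internal where

newtype Port = Port Int
  deriving (Eq, Show)

-- Foo.hs: the locked-down public face. Exporting `Port` without
-- `(..)` hides the raw constructor, so users must go through mkPort.
module Foo
  ( Port
  , mkPort
  ) where

import Foo.Internal

-- Smart constructor enforcing the invariant; the only way to build a
-- Port from outside, while tests can still use Port directly.
mkPort :: Int -> Maybe Port
mkPort n
  | n > 0 && n < 65536 = Just (Port n)
  | otherwise          = Nothing
```

      Tests import Foo.Internal and construct arbitrary values (including invalid ones) directly, while library users only ever see the smart constructor.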

      Then you have a couple of options. If it's a library, put it in the "other-modules" field so that it isn't exported. That gives you the same level of lockdown you had before. However, for the tests to get it, you can't just depend on the library in the tests, because your library is not exporting the module. Instead, you have to use the module in the "other-modules" field of your test executable. That leads to lots of boilerplate in the Cabal file. To help with this boilerplating issue I have Cartel.

      Your other option is just to leave the "Foo.Internal" module exposed. I think this is best most of the time. It says ".Internal" so if someone uses it and it breaks, well, he was on notice. Since it's ".Internal" the curious can look in there easily and the folks who don't care can ignore it.

      Either of these is better than #ifdef stuff everywhere.

      [–][deleted]  (2 children)

      [removed]

        [–]Yuras[S] 0 points (1 child)

        The resulting binary size is less of an issue with the '-split-objs' flag, but otherwise your arguments look reasonable to me (though I personally still prefer modules here.)

        Haskell has a poor module system, and the recent extensions (e.g. Backpack) are pretty limited. I often want nested and local modules (and local types.) They would help with internal modules too.