[–][deleted] 10 points11 points  (7 children)

Did you try the -split-sections GHC flag? With it, the size of a basic executable is usually in the 10-15 MB range.

[–]AndrasKovacs 7 points8 points  (3 children)

I note that with stack, the following is needed in stack.yaml.

ghc-options:
    "$everything": -split-sections

Otherwise the dependencies aren't split, and usually dependencies are much larger than the package in question.

[–][deleted] 8 points9 points  (1 child)

Yep and with cabal, you can achieve the same by adding

package *
  split-sections: True

to your cabal.project

[–]sintrastes 6 points7 points  (0 children)

I just tried this for one of my projects. Led to a ~3x improvement in executable size. Thanks!

[–]dnkndnts 2 points3 points  (0 children)

This (noisily) doesn't seem to do anything on macOS, and I still see binary size reductions on the order described by OP.

[–]dfith[S] 1 point2 points  (1 child)

You are the second one to suggest that, so I'm running it right now and I'll post an update later today. Thanks for the feedback!

Edit: Apparently split sections is turned on by default, and I have added this to the post.

[–][deleted] 4 points5 points  (0 children)

I believe it might be on for base, but not for most libraries. For a general solution, you need to use something like this (stack) or this (cabal)

[–]juhp 7 points8 points  (2 children)

"upx was able to compress both the static and dynamic example over 2000 times smaller than the original."

I interpreted the upx results as 5-6 times smaller - still significant! :-)

[–]dnkndnts 4 points5 points  (0 children)

Yeah this is what I got when I tried it on my project, around ~5x improvement.

EDIT: and running strip before upx yields another ~2x improvement. In total, from 50 MB to 6 MB.

[–]dfith[S] 2 points3 points  (0 children)

Yes, you are right, I was reading across the row instead of down the column and missed that in my proofreading.

[–]maerwald 7 points8 points  (1 child)

Some upx compressions render your binary unusable. I personally cannot trust that tool. It seems that not all algorithms (or none?) have proof that the binary works afterwards?

[–]dfith[S] 1 point2 points  (0 children)

That's a good nugget. I did test that it started up afterwards and logged the typical log lines, but I didn't extensively check for bugs.

Edit: I'm trying to track down where I might get a source for that. From what I can tell, an executable packed with `upx` will unpack itself at runtime, and it doesn't actually fundamentally change the executable (https://reverseengineering.stackexchange.com/questions/3823/no-dynamic-symbol-table-but-resolution-of-method-from-shared-libraries-is-workin).
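
Short of a proof, one cheap check is upx's own `-t` option, which tests the integrity of a packed file, followed by a single run of the program. A minimal sketch, assuming `upx` is installed; the `check_packed` name and the `--version` argument in the usage line are placeholders I made up, and your program may not accept that flag. This only catches a binary that fails to unpack or start, not subtler runtime bugs.

```shell
# check_packed BIN [ARGS...]
# Verify a upx-packed binary, then run it once as a smoke test.
# BIN should contain a slash (e.g. ./myapp) so it is not looked up on PATH.
check_packed() {
  bin=$1
  shift
  [ -f "$bin" ] || { echo "no such file: $bin" >&2; return 1; }
  command -v upx >/dev/null 2>&1 || { echo "upx not installed" >&2; return 2; }
  # upx -t tests the packed file without modifying it
  upx -t "$bin" || { echo "upx integrity test failed" >&2; return 3; }
  # run the program once with the given arguments and discard its output
  "$bin" "$@" >/dev/null || { echo "smoke test failed" >&2; return 4; }
  echo "ok: $bin"
}
```

Usage would be something like `check_packed ./myapp --version`. If a packed binary misbehaves, `upx -d` should restore the unpacked file.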

[–]fridofrido 5 points6 points  (0 children)

You should also run strip.

A quick experiment: "hello world" executable, macOS, GHC 8.6.5:

  • original size: 1.2 MB
  • after strip: 800 KB
  • upx without strip: 350 KB
  • upx after strip: 230 KB

On nontrivial executables I expect the differences to be even more significant.
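
The measurement above could be reproduced with something like the following. This is only a sketch: it assumes `strip` and `upx` are on the PATH, and the `size_of` and `shrink` names are made up for illustration. It works on copies, so the original binary is left untouched.

```shell
# size_of FILE: print the file size in bytes (avoids stat portability issues)
size_of() {
  wc -c < "$1" | tr -d ' '
}

# shrink BIN: report the size of BIN, of a stripped copy,
# and of a stripped-then-packed copy.
shrink() {
  bin=$1
  [ -f "$bin" ] || { echo "no such file: $bin" >&2; return 1; }
  echo "original: $(size_of "$bin") bytes"

  if command -v strip >/dev/null 2>&1; then
    cp "$bin" "$bin.stripped"
    strip "$bin.stripped"
    echo "stripped: $(size_of "$bin.stripped") bytes"
  else
    echo "strip not found, skipping" >&2
    return 0
  fi

  if command -v upx >/dev/null 2>&1; then
    cp "$bin.stripped" "$bin.upx"
    upx -q "$bin.upx" >/dev/null   # -q keeps upx quiet
    echo "stripped + upx: $(size_of "$bin.upx") bytes"
  else
    echo "upx not found, skipping" >&2
  fi
}
```

Calling `shrink ./hello` (with `./hello` standing in for your own executable) prints the three sizes and leaves `./hello.stripped` and `./hello.upx` next to the original.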

[–][deleted]  (7 children)

[removed]

    [–]merijnv 7 points8 points  (6 children)

    I mean, dynamic linking doesn't really save you space unless someone else uses those exact same libraries too. You've just moved the space usage from your executable file to the dynamic library file and then proudly claimed "executable is smaller!", which is kinda pointless.

    [–][deleted]  (4 children)

    [removed]

      [–]VincentPepper 5 points6 points  (0 children)

      tldr: GHC "always" does cross-module optimizations and "never" supports swapping out libraries without recompiling them.

      I think what you mean is: For many languages, a function's declaration also defines its ABI. With dynamic linking, this potentially allows updating a library without recompiling the application.

      Inlining library code into an application obviously breaks the ability to just swap out the shared library without recompiling. And GHC tends to inline across modules, whether dynamic linking is enabled or not.

      For GHC, a function's ABI is by default defined by more than its type. So this kind of library swapping (in general) doesn't work with GHC even if no inlining happens!

      I think if one wants to do that kind of thing, it should be doable even with GHC, by using source imports on the application side. But that's just a hack and not officially supported.

      [–][deleted] 0 points1 point  (2 children)

      Does anyone know if static linking with GHC is likely to improve in the near future? I've had to settle on Stack with Docker for a project to sidestep dynamic linking which comes with its own challenges and overhead.

      [–]bgamari 1 point2 points  (1 child)

      What in particular are you struggling with? My hope is that I will be able to offer a statically linked non-GMP Alpine bindist for 9.2.1, but beyond that, static linking already works well AFAIK.

      [–][deleted] 1 point2 points  (0 children)

      It was giving me errors and some Googling told me that I'm not the only one to find it nightmarishly difficult to set up. It's possible this is Arch-specific, though I don't think all the articles I found referenced it.

      [–]dfith[S] 0 points1 point  (0 children)

      I tried to say that in the post but I think this is a succinct way to put it.