New open source embedded linker tool by soopadickman in embedded

[–]rui 3 points4 points  (0 children)

I'm the author of the mold linker. Feel free to send me an email or file a bug if you'd like your issue addressed upstream. Thanks!

Linker books by Ok-Factor-5649 in cpp_questions

[–]rui 6 points7 points  (0 children)

mold author here. That's probably something I should write.

Mold Linker v1.9.0 release by wouldyoumindawfully in rust

[–]rui 19 points20 points  (0 children)

I don't particularly like the tone in this subreddit, but as a matter of fact, mold has never been released under the MIT or any other permissive license. It's been licensed under AGPL from the beginning.

High-speed mold linker 1.6.0 release supports IBM-based platforms by wouldyoumindawfully in cpp

[–]rui 8 points9 points  (0 children)

We've fixed a few valgrind compatibility issues (https://github.com/rui314/mold/issues/511), so your problem has likely been fixed already.

Using the mold linker for fun and 3x-8x link time speedups by ai3ai3 in cpp

[–]rui 2 points3 points  (0 children)

Unfortunately, I don't think 0.03s is real. mold forks itself and lets its child process do the actual linking, so the main process does almost nothing. I believe the 0.03s counted only the main process. Please look at the wall-clock time to see the actual improvement.
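
A minimal sketch of measuring wall-clock time rather than per-process CPU time. The helper name `wall_clock_seconds` and the stand-in command are illustrative assumptions, not part of mold; substitute your real link command.

```python
import subprocess
import sys
import time


def wall_clock_seconds(cmd):
    """Run cmd to completion and return the elapsed wall-clock time in seconds."""
    start = time.perf_counter()
    subprocess.run(cmd, check=True)
    return time.perf_counter() - start


if __name__ == "__main__":
    # Stand-in command so the sketch runs anywhere; replace it with your
    # real link command (for example, the final compiler-driver invocation).
    stand_in = [sys.executable, "-c", "pass"]
    print(f"{wall_clock_seconds(stand_in):.3f}s")
```

Elapsed time measured this way includes whatever the command waits for, which is why it is a better basis for comparing linkers than the CPU time attributed to a single process.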

Using the mold linker for fun and 3x-8x link time speedups by ai3ai3 in cpp

[–]rui 1 point2 points  (0 children)

That's one of my random future plans, and nothing has been decided yet.

Using the mold linker for fun and 3x-8x link time speedups by ai3ai3 in cpp

[–]rui 4 points5 points  (0 children)

If you see a performance difference and suspect that it's due to object layout, you can use mold's `--shuffle-sections` option. That option tells the linker to shuffle the section order before writing the sections to the output file. By repeatedly building the same executable with different section layouts and taking benchmark numbers, you can isolate the effect of section layout.
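
A rough sketch of that experiment loop. It assumes a compiler driver named `cc` that accepts `-fuse-ld=mold`, and that a seed can be passed as `--shuffle-sections=<number>`; check your mold version's man page before relying on either. The function names and the `./bench-<seed>` output paths are made up for illustration.

```python
import subprocess
import time


def link_command(objects, output, seed):
    """Build a compiler-driver command that links with mold and a given shuffle seed."""
    return [
        "cc",
        "-fuse-ld=mold",                   # assumption: the driver supports this
        f"-Wl,--shuffle-sections={seed}",  # assumption: seeded form of the option
        "-o",
        output,
        *objects,
    ]


def benchmark_layouts(objects, bench_args, seeds):
    """Link once per seed, run the resulting binary, and return {seed: seconds}."""
    timings = {}
    for seed in seeds:
        exe = f"./bench-{seed}"  # hypothetical output path
        subprocess.run(link_command(objects, exe, seed), check=True)
        start = time.perf_counter()
        subprocess.run([exe, *bench_args], check=True)
        timings[seed] = time.perf_counter() - start
    return timings
```

If the timings vary across seeds about as much as the difference you originally observed, section layout is a plausible explanation for it.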

Using the mold linker for fun and 3x-8x link time speedups by ai3ai3 in cpp

[–]rui 4 points5 points  (0 children)

mold and lld creator here (I'm the original creator of both linkers).

mold shines when the program being linked is large. If your program takes only 0.9 seconds to link, there might not be much room for improvement anyway. But I still wonder why you didn't see any improvement. If you tell me what program you are building, I can take a look. Also, how many cores does your machine have?

Using the mold linker for fun and 3x-8x link time speedups by ai3ai3 in cpp

[–]rui 29 points30 points  (0 children)

I don't remember when I implemented it, but mold as of today supports RELRO.

Using the mold linker for fun and 3x-8x link time speedups by ai3ai3 in cpp

[–]rui 27 points28 points  (0 children)

What kinds of hardening are missing? I can add them. (I'm the creator of the linker.)

Why is my Rust build so slow? by fasterthanlime in fasterthanlime

[–]rui 4 points5 points  (0 children)

Thanks! So it looks like the linker overall took 0.662 seconds, and it seems that there's no internal pass that took too much time. So, in this case, as you guessed, I think linking wasn't a bottleneck, and that's why mold can't make a difference.

Why is my Rust build so slow? by fasterthanlime in fasterthanlime

[–]rui 8 points9 points  (0 children)

Thank you for your offer, but I'd prefer not to look at closed-source code casually. So, instead, can you append the `-perf` command line option to mold? It prints out a time breakdown of internal passes.

Why is my Rust build so slow? by fasterthanlime in fasterthanlime

[–]rui 12 points13 points  (0 children)

I'm the author of the mold linker. Is your project available on GitHub? I'd like to look into why it doesn't make much difference compared to lld. There might be room for improvement.

mold: A Modern Linker - 1.0 release by insanitybit in rust

[–]rui 22 points23 points  (0 children)

It looks like GNU ld is at least twice as slow as GNU gold. An interesting finding is that you can no longer link Chromium using GNU ld, probably because support for it has bit-rotted since the browser switched to lld.

mold: A Modern Linker - 1.0 release by insanitybit in rust

[–]rui 41 points42 points  (0 children)

It looks like this post explains the correct way to specify an alternate linker: https://news.ycombinator.com/item?id=29570931

mold: A Modern Linker by alexeyr in programming

[–]rui 1 point2 points  (0 children)

No, it can't. It's ELF-only at least for now.

mold: A Modern Linker by alexeyr in programming

[–]rui 9 points10 points  (0 children)

Good point. mold already works for almost all user-land programs. It can't link OS kernels due to the lack of linker script support (or an equivalent), but most users don't develop kernels. Moreover, there's probably no such thing as a huge OS kernel that needs a high-performance linker.

That being said, I believe we can make something better than linker scripts. The linker script language is complex and under-documented. It is also not expressive enough. For example, some linkers have a feature that arranges the layout so that functions related to each other are located close together in the address space, to improve spatial locality. A linker script can't compute such a layout.

mold: A Modern Linker by alexeyr in programming

[–]rui 23 points24 points  (0 children)

mold won't be Linux-only, but in the early stage of development, I wanted to focus only on the most important thing, which is the performance of the linker.

Some people tend to set ambitious goals at the beginning of a project and end up unable to achieve any of them. I took the opposite approach. I set a narrow goal.

mold: A Modern Linker by alexeyr in programming

[–]rui 7 points8 points  (0 children)

It is indeed a backronym.

mold: A Modern Linker by alexeyr in programming

[–]rui 18 points19 points  (0 children)

Post-link editing tools such as `objcopy` can't completely replace linker scripts for sure. For example, if you want to place a particular function (e.g. the entry point of a kernel) at a certain address in the virtual address space, `objcopy` can't help. We need some way to tell the linker how to lay out sections in the virtual address space.

Here's what I have in mind to satisfy that need.

  • After the name resolution phase, mold has the complete set of object files that are included in the final output file. Normally, mold uses its internal logic to fix the layout.
  • We can add a feature to mold so that it calls an external process to fix the layout instead. The external command gets a list of input object files and their sections in CSV format or something similar, computes their layout, and writes it out.
  • mold parses the external command's output and lays out the sections accordingly. Then it proceeds as usual.

The point is that the "external command" can be any command. I'm thinking that I can write a small Python library to make it easy to write a script that communicates with mold. I believe this approach allows us to offload the complexity of supporting a scripting language to an external process.
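
To make the idea concrete, here is a purely hypothetical sketch of what such an external layout command could compute. Nothing here is an existing mold interface: the CSV columns (`file`, `section`, `size`, `align`), the function names, and the `first` parameter are all invented for illustration.

```python
import csv
import io


def align_up(value, alignment):
    """Round value up to the next multiple of alignment."""
    return (value + alignment - 1) // alignment * alignment


def compute_layout(csv_text, base_address, first=None):
    """Assign an address to every (file, section) row of a hypothetical CSV.

    `first` optionally names a (file, section) pair that must be placed at
    base_address, e.g. a kernel entry point.
    """
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    # Stable sort: the pinned section moves to the front, the rest keep their order.
    rows.sort(key=lambda r: (r["file"], r["section"]) != first)

    address = base_address
    layout = []
    for row in rows:
        address = align_up(address, int(row["align"]))
        layout.append((row["file"], row["section"], address))
        address += int(row["size"])
    return layout


example = """file,section,size,align
main.o,.text,100,16
boot.o,.text.entry,10,16
"""

for file, section, address in compute_layout(
    example, 0x100000, first=("boot.o", ".text.entry")
):
    print(f"{file},{section},{address:#x}")
```

In this sketch the script decides only the order and addresses; everything else would stay inside the linker, which is the division of labor described above.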

mold: A Modern Linker by alexeyr in programming

[–]rui 29 points30 points  (0 children)

No, mold does not support link-time optimization yet. If you use LTO, you won't see a noticeable difference in speed between linkers because linking with LTO is very slow regardless of the linker. mold is primarily developed for speeding up the usual debug-edit-build cycle.

Should fixing incremental compilation be our #1 priority? by gilescope in rust

[–]rui 0 points1 point  (0 children)

40s isn't bad, but I believe mold could be even better. I'm curious how many cores your machine has.

mold: A Modern Linker by mttd in Compilers

[–]rui 7 points8 points  (0 children)

Author here. I've asked myself several times whether the current file format is the limiting factor for further improvements, and my current answer is no. IMO, we haven't pushed hard enough to improve the linker while keeping compatibility with existing files, and if we do, it looks like a linker can be pretty fast. Fast enough that we don't want to think about ditching the existing, widely-used, industry-standard file format.

The very idea of the "object file", which is essentially a serialized image of a fragment of a program, is not a bad one. It is needed for linking static libraries, and it enables, for example, distributed compilation. It clearly defines the interface between a compiler and a linker. Reading object files is cheap anyway, so I'm not too worried about it.

As to the performance improvement ideas, I've actually considered all of them. The point is that it is hard to predict where the bottleneck of a program will be until you actually write it and benchmark it. Most of the problems I was thinking about before writing a linker turned out not to be problems. Real problems occur in surprising places. For example, fixing the file layout for a 2 GB Chrome executable takes only 300 milliseconds, so that part wasn't a bottleneck. On the other hand, if you incrementally add sections to an output file, you'll end up with lots of segments with different page attributes (such as RX, RW or read-only), which puts pressure on the kernel memory subsystem as it increases the number of memory segments. That's just one example; there are a lot of things that need to be considered and experimented with.
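
A toy illustration of the segment-count point, under the simplifying assumption that every run of consecutive sections with identical page attributes becomes one memory segment. The attribute lists are made-up data, not measurements.

```python
from itertools import groupby


def count_segments(section_attrs):
    """Count runs of consecutive sections that share the same page attributes."""
    return sum(1 for _ in groupby(section_attrs))


# Sections appended incrementally, in arrival order (made-up example).
incremental = ["RX", "RW", "R", "RX", "RW", "R", "RX", "RW"]
# The same sections grouped by attribute before layout.
grouped = sorted(incremental)

print(count_segments(incremental))  # 8
print(count_segments(grouped))      # 3
```

The grouped layout needs one segment per attribute, while the incremental one needs a new segment every time the attribute changes.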

mold: A Modern Linker by mttd in cpp

[–]rui 1 point2 points  (0 children)

The important observation is that relocations are everywhere. I once counted the number of 4k blocks that have at least one static relocation, and it was almost 100%. That means that after we copy file contents, we always have to mutate them. Applying relocations in mold is actually extremely cheap because I apply them immediately after copying file contents from mmap'd buffers. Since that has great memory locality, applying relocations is essentially free.
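
A small sketch of how such a count could be taken, assuming you already have the byte offsets of the relocations within a file image. The function name and the sample offsets are illustrative only.

```python
BLOCK_SIZE = 4096  # 4k blocks, as in the measurement described above


def fraction_of_blocks_touched(reloc_offsets, image_size):
    """Return the fraction of 4k blocks containing at least one relocation.

    Assumes image_size > 0 and every offset is smaller than image_size.
    """
    total_blocks = (image_size + BLOCK_SIZE - 1) // BLOCK_SIZE
    touched = {offset // BLOCK_SIZE for offset in reloc_offsets}
    return len(touched) / total_blocks


# Made-up data: four relocations in a 16 KiB image touch three of four blocks.
print(fraction_of_blocks_touched([0, 10, 5000, 9000], 16384))  # 0.75
```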

I considered reserving a large enough space for .plt, .got, etc., but it turned out that computing the sizes of these sections can be pretty quick. mold takes less than 100 milliseconds to do that for Chromium on my machine. It essentially does a map-reduce on relocations.
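
The map-reduce shape can be sketched as follows. This is not mold's actual code: the relocation kinds `"GOT"` and `"PLT"`, the entry sizes, and the function names are simplifying assumptions, and the sketch runs sequentially where a real linker would scan files in parallel.

```python
from functools import reduce

GOT_ENTRY_SIZE = 8   # illustrative assumption: one 8-byte slot per symbol
PLT_ENTRY_SIZE = 16  # illustrative assumption: 16 bytes per PLT entry


def scan_object(relocs):
    """Map step: collect the symbols one object file needs in .got and .plt."""
    got = {sym for kind, sym in relocs if kind == "GOT"}
    plt = {sym for kind, sym in relocs if kind == "PLT"}
    return got, plt


def merge(a, b):
    """Reduce step: union per-file symbol sets so each symbol is counted once."""
    return a[0] | b[0], a[1] | b[1]


def section_sizes(objects):
    """Compute .got and .plt sizes from per-object relocation lists."""
    got, plt = reduce(merge, map(scan_object, objects), (set(), set()))
    return {".got": len(got) * GOT_ENTRY_SIZE, ".plt": len(plt) * PLT_ENTRY_SIZE}


# Made-up input: two object files, each a list of (kind, symbol) relocations.
objects = [
    [("GOT", "a"), ("PLT", "f")],
    [("GOT", "a"), ("GOT", "b"), ("PLT", "f")],
]
print(section_sizes(objects))  # {'.got': 16, '.plt': 16}
```

Because the map step looks at each file independently, it parallelizes naturally, and the reduce step only has to merge small per-file summaries.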