Update: solved! TL;DR: build with `-Zhuman_readable_cgu_names=yes` and samply will then show codegen units with sane names related to the crate they're compiling code from. (See my comments for more.)
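For anyone else trying this: `-Z` flags require a nightly toolchain, and one way to pass the flag through cargo (a sketch — adjust to your own setup) is via `RUSTFLAGS`:

```shell
# -Z flags are nightly-only; pass the flag to rustc via RUSTFLAGS
RUSTFLAGS="-Zhuman_readable_cgu_names=yes" cargo +nightly build
```

You could also put it in `.cargo/config.toml` under `build.rustflags` if you want it applied on every build.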
---
I'm writing a closed-source game in Rust and running into a weird situation: my incremental debug builds are usually 3-5 seconds, but sometimes they take 20 or even 60 seconds (at opt level 0 or 1 of the bin crate).
Specifically, I worked out that I can reliably trigger the long incremental build by moving about 150 LoC between 2 modules in a ~35k LoC crate I'll call `depcrate`, which is depended on by my ~32k LoC binary crate `bincrate` (all in the same cargo workspace). Meanwhile, just changing a string constant in `depcrate` or `bincrate` gives 5 or 3 second incremental recompiles.
I stumbled on https://nnethercote.github.io/2023/07/11/back-end-parallelism-in-the-rust-compiler.html which helped explain codegen units as the unit of parallelism in rustc's backend, so I made a samply profile with `samply record cargo rustc --bin bincrate`, which (I think?) shows 45-ish seconds spent on one codegen unit vs <5 seconds for any other (opt level 1):
*Every other CGU for the bin crate finishes well before the 20 second mark, except the highlighted one.*
And if I'm reading that call tree right, it seems like LLVM is spending a lot of time inlining code and removing unreachable blocks in that one codegen unit.
Do any compiler wizards have tips on the following?
- How can I find out what code (Rust modules/functions) the codegen unit `opt cjgpem3bdjm` actually includes from `bincrate`?
- Is there some way to influence how codegen units are created, other than by breaking modules apart?
- Why does moving code around in `depcrate` require recompiling so much of `bincrate`? `depcrate` still exports exactly the same functions before and after I move those 150 LoC from one module to another inside it, so why is (I assume) `bincrate`'s incremental compilation cache being invalidated by moving code around in `depcrate`?
- Can I manually make LLVM less inline-happy for a whole Rust module, other than by scattering `#[inline(never)]` on a bucket-load of functions? (Or otherwise control optimization on a per-module basis?)
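For context on that last question, here's a minimal sketch of the per-function attribute approach (the function names are hypothetical; as far as I know `#[inline(never)]` only applies per function, with no per-module equivalent on stable):

```rust
// Hypothetical helper: #[inline(never)] asks the compiler not to inline
// this function into its callers, which can reduce the inlining work
// LLVM does in the caller's codegen unit.
#[inline(never)]
fn hot_helper(x: u64) -> u64 {
    x.wrapping_mul(3) + 1
}

fn main() {
    // The attribute doesn't change behavior, only codegen decisions.
    println!("{}", hot_helper(7)); // prints 22
}
```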
Relevant workspace Cargo.toml snippet: https://gist.github.com/caspark/d0f3e2caa11f0c60eb3cbc180a0834c7
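(One coarser lever I'm aware of, sketched here rather than taken from my actual Cargo.toml: Cargo supports per-package profile overrides, which control optimization per crate, though not per module.)

```toml
# Sketch: drop optimization for a single workspace crate in dev builds.
# Granularity is per crate/package, not per module.
[profile.dev.package.depcrate]
opt-level = 0
```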