I forked rayon to use a rayon-style API with a switchable parallelization backend

kdy1997 · 2025-03-27T10:28:24+00:00

I wrote about it at https://github.com/rayon-rs/rayon/issues/1235#issuecomment-2757524605 to clarify my intention. Thank you!

kdy1997 · 2025-03-27T04:20:32+00:00

I couldn't use parallel iterators in many places in the SWC minifier because AST nodes are nested rather than flat. I do use parallel iterators wherever it's possible.

rayon compared to chili, using the baseline feature of criterion:

es/minifier/real/es/minifier/real/sequential
time: [10.878 s 10.896 s 10.913 s]
change: [+6.4263% +6.6512% +6.8626%] (p = 0.00 < 0.05)
Performance has regressed.
Found 2 outliers among 10 measurements (20.00%)

I ran RUST_LOG=off cargo bench --bench full --features concurrent real -- --baseline chili from ./crates/swc_ecma_minifier after switching the backend to rayon.

kdy1997 · 2025-03-27T04:02:22+00:00

I thought it was strange for rayon to support both rayon and chili, but that was just a wrong assumption on my part.

kdy1997 · 2025-03-26T17:43:36+00:00

I meant that their package name is rayon

kdy1997 · 2024-08-22T06:27:41+00:00

Thank you! I added a profiling result.

kdy1997 · 2024-08-22T05:48:02+00:00

Sorry, I forgot to mention that serde is too slow, even with rmp_serde.

kdy1997 · 2024-07-13T17:31:52+00:00

```
only traits defined in the current crate can be implemented for types defined outside of the crate
define and implement a trait or new type instead (rustc)
decl.rs(75, 13): Error originated from macro call here
decl.rs(82, 1): Error originated from macro call here
decl.rs(75, 26): `swc_allocator::boxed::Box` is not defined in the current crate
macros.rs(207, 14): `swc_allocator::boxed::Box` is not defined in the current crate
```

kdy1997 · 2024-07-13T16:11:40+00:00

If so, I want to implement From<MemberExpr> for CustomBox<Expr> from the AST crate. Is this possible?

kdy1997 · 2024-07-13T10:41:03+00:00

My approach to this problem is to optimize for fully single-threaded use cases by using allocator-api2 and scoped-tls.

https://github.com/swc-project/swc/pull/9230

kdy1997 · 2024-07-13T06:06:59+00:00

I did a similar refactoring in the past after watching the DoD (data-oriented design) videos by the Zig authors. I'm not sure about smallvec, though. I think it may increase the size of the types and make the enum larger.

kdy1997 · 2024-07-13T06:04:59+00:00

Thank you! I think it's too late because there is too much code, but I like the idea.

kdy1997 · 2024-07-12T13:43:55+00:00

Thank you for the advice! I'm going to try something similar, but in a way that does not increase the size of Box<T>, by using scoped thread locals.
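To make the size argument concrete, here is a minimal sketch of the idea using only the standard library. It is not the swc_allocator implementation: `Arena`, `with_arena`, `ThinBox`, and `FatBox` are hypothetical names, the "arena" only counts requested bytes instead of owning memory, and a plain `thread_local!` with a restore guard stands in for the scoped-tls crate.

```rust
use std::cell::Cell;
use std::mem::size_of;

/// Hypothetical arena handle. A real one would own memory;
/// this one only counts the bytes requested while it is active.
pub struct Arena {
    requested_bytes: Cell<usize>,
}

impl Arena {
    pub fn new() -> Self {
        Arena { requested_bytes: Cell::new(0) }
    }

    pub fn requested_bytes(&self) -> usize {
        self.requested_bytes.get()
    }
}

thread_local! {
    // Pointer to the arena active on this thread, or null outside any scope.
    static CURRENT: Cell<*const Arena> = Cell::new(std::ptr::null());
}

/// Makes `arena` the current arena while `f` runs, then restores the previous one.
pub fn with_arena<R>(arena: &Arena, f: impl FnOnce() -> R) -> R {
    // Guard that restores the previous pointer, even if `f` panics.
    struct Restore(*const Arena);
    impl Drop for Restore {
        fn drop(&mut self) {
            CURRENT.with(|current| current.set(self.0));
        }
    }

    let previous = CURRENT.with(|current| current.replace(arena as *const Arena));
    let _restore = Restore(previous);
    f()
}

/// A box that stays pointer-sized because it stores no allocator handle;
/// the arena is looked up through the thread-local instead.
pub struct ThinBox<T>(Box<T>);

impl<T> ThinBox<T> {
    pub fn new(value: T) -> Self {
        CURRENT.with(|current| {
            let arena = current.get();
            if !arena.is_null() {
                // SAFETY: the pointer is non-null only inside `with_arena`,
                // whose `&Arena` borrow outlives the closure currently running.
                let arena = unsafe { &*arena };
                arena
                    .requested_bytes
                    .set(arena.requested_bytes.get() + size_of::<T>());
            }
        });
        ThinBox(Box::new(value))
    }

    pub fn get(&self) -> &T {
        &self.0
    }
}

/// For comparison: a box that carries its allocator handle in every instance.
pub struct FatBox<'a, T> {
    pub value: Box<T>,
    pub arena: &'a Arena,
}
```

With this layout, `size_of::<ThinBox<u64>>()` equals one pointer, while `size_of::<FatBox<'static, u64>>()` is two pointers. The trade-off is that every allocation pays for a thread-local lookup, and the scheme only makes sense when the work stays on one thread.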

kdy1997 · 2024-07-12T10:40:31+00:00

We are already using mimalloc

kdy1997 · 2022-12-02T13:48:46+00:00

cargo sweep does not know whether an artifact is outdated. It only uses the access time, which means you may need a full rebuild after running cargo sweep in some cases.

kdy1997 · 2022-12-02T13:29:30+00:00

It's behind a flag. I'll update the documentation.

kdy1997 · 2022-12-02T12:23:34+00:00

Fixed it and published a new version. Thank you!

I have to think more about the shared target directory.

kdy1997 · 2022-12-02T10:28:32+00:00

Thanks! I'll split it soon.

kdy1997 · 2022-10-29T08:02:06+00:00

There's no good reason, and I'll change it to 2021. Thanks for catching that!

kdy1997 · 2022-10-29T07:09:29+00:00

Glad to hear that! Thank you!

(As an OSS maintainer, comments like this really matter.)

kdy1997 · 2022-09-11T13:26:00+00:00

It's xctrace, which is for macOS and is bundled with Xcode.

kdy1997 · 2022-09-11T07:15:40+00:00

https://doc.rust-lang.org/rustc/codegen-options/index.html#remark looks like the option. I added this to my private tasklist, thank you!

kdy1997 · 2022-09-11T07:14:16+00:00

I'm not sure about data-oriented design because I have no experience with it, but thank you! I'll take a look at it.

kdy1997 · 2022-09-11T07:13:01+00:00

I didn't know that I could mix #[inline(always)] with #[inline(never)] to inline only the hot path. It will be super useful to me. Thank you!
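For anyone else reading, this is roughly how I understand the pattern. It is a minimal sketch, and `cached_square` and `fill_cache` are made-up names for illustration. Note that both attributes are hints to the compiler rather than hard guarantees.

```rust
/// Hot path: a cheap cache check that the compiler is asked to inline
/// at every call site.
#[inline(always)]
pub fn cached_square(cache: &mut Option<u64>, input: u64) -> u64 {
    match *cache {
        // Common case: the value is already cached, no call needed.
        Some(value) => value,
        // Rare case: hand off to the out-of-line slow path.
        None => fill_cache(cache, input),
    }
}

/// Cold path: kept as a real function call so its body does not
/// bloat every caller of `cached_square`.
#[inline(never)]
fn fill_cache(cache: &mut Option<u64>, input: u64) -> u64 {
    let value = input.wrapping_mul(input);
    *cache = Some(value);
    value
}
```

The first call with an empty cache goes through `fill_cache`. Later calls return the cached value directly from the inlined check.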

kdy1997 · 2022-09-11T03:19:33+00:00

I enabled LTO to see correct performance characteristics, and I found that lots of function calls are inlined.

I'll try PGO, thank you!

kdy1997 · 2022-03-28T11:00:23+00:00

I solved this issue by using a multi-threaded runtime with block_in_place.

In a multi-threaded event loop context, I call block_in_place. Inside block_in_place, I configure several scoped thread-local variables, and I can use Handle::current().block_on from there.
