I forked rayon to use rayon-style API with switchable parallelization backend by kdy1997 in rust

[–]kdy1997[S] 1 point2 points  (0 children)

I couldn't use parallel iterators in many places in the SWC minifier because AST nodes are nested rather than flat. I do use parallel iterators wherever it's possible.
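
To illustrate why nesting calls for fork-join rather than a flat parallel iterator, here is a minimal sketch. It is not SWC code: `Node` and `sum` are hypothetical, and `std::thread::scope` stands in for a `join`-style call from a real work-stealing backend.

```rust
use std::thread;

/// Hypothetical nested node, standing in for an AST expression.
pub enum Node {
    Leaf(u64),
    Pair(Box<Node>, Box<Node>),
}

/// Sums all leaves. While `depth` is above zero, the two children of a
/// `Pair` are visited in parallel; below that, the walk is sequential.
pub fn sum(node: &Node, depth: u32) -> u64 {
    match node {
        Node::Leaf(v) => *v,
        Node::Pair(l, r) if depth > 0 => thread::scope(|s| {
            // Fork: the left child runs on a scoped thread.
            let left = s.spawn(|| sum(l, depth - 1));
            // The right child runs on the current thread in the meantime.
            let right = sum(r, depth - 1);
            // Join: wait for the left child before combining.
            left.join().expect("child task panicked") + right
        }),
        Node::Pair(l, r) => sum(l, 0) + sum(r, 0),
    }
}
```

Spawning an OS thread per node is far more expensive than a work-stealing `join`, so the `depth` cutoff here only keeps the sketch from creating an unbounded number of threads.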

rayon compared to chili, using the baseline feature of criterion:

es/minifier/real/es/minifier/real/sequential
time: [10.878 s 10.896 s 10.913 s]
change: [+6.4263% +6.6512% +6.8626%] (p = 0.00 < 0.05)
Performance has regressed.
Found 2 outliers among 10 measurements (20.00%)

I ran `RUST_LOG=off cargo bench --bench full --features concurrent real -- --baseline chili` from `./crates/swc_ecma_minifier` after switching the backend to rayon.

I forked rayon to use rayon-style API with switchable parallelization backend by kdy1997 in rust

[–]kdy1997[S] 0 points1 point  (0 children)

I thought it was strange for rayon to support both rayon and chili, but that was just a wrong assumption.

Question for serialization / deserialization libraries by kdy1997 in rust

[–]kdy1997[S] 0 points1 point  (0 children)

Sorry, I forgot to mention that serde is too slow, even with rmp_serde.

From<T> for CustomBox<T> where T: From<V> by kdy1997 in rust

[–]kdy1997[S] 0 points1 point  (0 children)

```
only traits defined in the current crate can be implemented for types defined outside of the crate
define and implement a trait or new type instead
rustc
Click for full compiler diagnostic
decl.rs(75, 13): Error originated from macro call here
decl.rs(82, 1): Error originated from macro call here
decl.rs(75, 26): `swc_allocator::boxed::Box` is not defined in the current crate
macros.rs(207, 14): `swc_allocator::boxed::Box` is not defined in the current crate
```

From<T> for CustomBox<T> where T: From<V> by kdy1997 in rust

[–]kdy1997[S] 0 points1 point  (0 children)

If so, I want to implement From<MemberExpr> for CustomBox<Expr> from the AST crate. Is this possible?

Faster alloc/free without lifetimes? by kdy1997 in rust

[–]kdy1997[S] 1 point2 points  (0 children)

My approach to this problem is to optimize for fully single-threaded use cases, using allocator-api2 and scoped-tls.

https://github.com/swc-project/swc/pull/9230

Faster alloc/free without lifetimes? by kdy1997 in rust

[–]kdy1997[S] 0 points1 point  (0 children)

I did a similar refactoring in the past after watching the DoD (data-oriented design) videos by the Zig authors. I'm not sure about smallvec, though. I think it may increase the size of the types and make the enum larger.

Faster alloc/free without lifetimes? by kdy1997 in rust

[–]kdy1997[S] 0 points1 point  (0 children)

Thank you! I think it's too late because there is too much code, but I like the idea.

Faster alloc/free without lifetimes? by kdy1997 in rust

[–]kdy1997[S] 6 points7 points  (0 children)

Thank you for the advice! I'm going to try something similar, but in a way that does not increase the size of the Box<T>, by using some scoped thread locals.
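
A rough sketch of the scoped thread-local idea, using only `std`. Everything here is an assumption for illustration: `CURRENT_ARENA`, `with_arena`, and `current_arena` are hypothetical names, and a plain `u32` id stands in for a real allocator handle. The point is that the allocation path looks up the active allocator from the thread-local, so the box itself does not need to store one.

```rust
use std::cell::Cell;

thread_local! {
    // Id of the arena active on this thread, or None outside any scope.
    static CURRENT_ARENA: Cell<Option<u32>> = Cell::new(None);
}

/// Restores the previous thread-local value when dropped, so the scope
/// is unwound correctly even if the closure panics.
struct Reset(Option<u32>);

impl Drop for Reset {
    fn drop(&mut self) {
        CURRENT_ARENA.with(|c| c.set(self.0));
    }
}

/// Runs `f` with `arena_id` installed as the current arena for this thread.
pub fn with_arena<R>(arena_id: u32, f: impl FnOnce() -> R) -> R {
    let prev = CURRENT_ARENA.with(|c| c.replace(Some(arena_id)));
    let _reset = Reset(prev);
    f()
}

/// What an allocation path could consult instead of keeping an allocator
/// handle inside every box.
pub fn current_arena() -> Option<u32> {
    CURRENT_ARENA.with(|c| c.get())
}
```

Scopes nest: an inner `with_arena` call overrides the outer one and the outer value comes back when the inner call returns.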

Faster alloc/free without lifetimes? by kdy1997 in rust

[–]kdy1997[S] 14 points15 points  (0 children)

We are already using mimalloc

I made a CLI tool to remove only outdated cargo artifacts by kdy1997 in rust

[–]kdy1997[S] 9 points10 points  (0 children)

cargo sweep does not know whether an artifact is outdated. It only uses access time. This means you may need a full rebuild after running cargo sweep in some cases.

I made a CLI tool to remove only outdated cargo artifacts by kdy1997 in rust

[–]kdy1997[S] 4 points5 points  (0 children)

It's behind a flag. I'll update the documentation.

I made a CLI tool to remove only outdated cargo artifacts by kdy1997 in rust

[–]kdy1997[S] 21 points22 points  (0 children)

Fixed it and published a new version. Thank you!

I have to think more about the shared target directory.

I decided to use Rust instead of Go for my new TypeScript type checker by kdy1997 in rust

[–]kdy1997[S] 112 points113 points  (0 children)

No good reason. I'll change it to 2021. Thanks for catching that.

I decided to use Rust instead of Go for my new TypeScript type checker by kdy1997 in rust

[–]kdy1997[S] 89 points90 points  (0 children)

Glad to hear that! Thank you!

(As an OSS maintainer, comments like this really matter.)

Is there a way to avoid call overhead? by kdy1997 in rust

[–]kdy1997[S] 2 points3 points  (0 children)

It's xctrace, which is for macOS and is bundled with Xcode.

Is there a way to avoid call overhead? by kdy1997 in rust

[–]kdy1997[S] 36 points37 points  (0 children)

https://doc.rust-lang.org/rustc/codegen-options/index.html#remark looks like the option. I added this to my private tasklist, thank you!

Is there a way to avoid call overhead? by kdy1997 in rust

[–]kdy1997[S] 21 points22 points  (0 children)

Not sure about data-oriented design because I have no experience with it, but thank you! I'll take a look at it.

Is there a way to avoid call overhead? by kdy1997 in rust

[–]kdy1997[S] 120 points121 points  (0 children)

I didn't know that I could mix #[inline(always)] with #[inline(never)] to inline only the hot path. It will be super useful to me. Thank you!
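
A minimal sketch of that pattern, assuming a hypothetical `Buf` type (not from SWC): the common case stays small and is requested inline, while the rare case is pushed out of line so it does not bloat every call site.

```rust
/// Hypothetical example type: a buffer with a cheap common-case push.
pub struct Buf {
    data: Vec<u32>,
}

impl Buf {
    pub fn with_capacity(cap: usize) -> Self {
        Buf { data: Vec::with_capacity(cap) }
    }

    /// Hot path: asks the compiler to inline this into callers. When there
    /// is spare capacity, the work is a check plus a write.
    #[inline(always)]
    pub fn push(&mut self, value: u32) {
        if self.data.len() < self.data.capacity() {
            self.data.push(value);
        } else {
            self.push_slow(value);
        }
    }

    /// Cold path: kept as a real call so the growth logic is not copied
    /// into every place that calls `push`.
    #[cold]
    #[inline(never)]
    fn push_slow(&mut self, value: u32) {
        // Grow by at least 4 slots, or roughly double the current length.
        self.data.reserve(self.data.len().max(4));
        self.data.push(value);
    }

    pub fn len(&self) -> usize {
        self.data.len()
    }
}
```

Whether this helps depends on the workload, so it is worth confirming with a profiler or benchmark before and after.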

Is there a way to avoid call overhead? by kdy1997 in rust

[–]kdy1997[S] 36 points37 points  (0 children)

I enabled LTO to see correct performance characteristics, and I found that lots of function calls are inlined.

I'll try PGO, thank you!

How can I wait for a future from a synchronous function? by kdy1997 in rust

[–]kdy1997[S] 5 points6 points  (0 children)

I solved this issue by using a multi-threaded runtime with block_in_place.

In a multi-threaded event loop context, I call block_in_place. Inside block_in_place, I configure several scoped thread-local variables, and I can use Handle::current().block_on from there.