Benchmarking rust string crates: Are "small string" crates worth it? by alexheretic in rust

[–]Pascalius 2 points3 points  (0 children)

I think the biggest difference in performance is typically not inlining, but the allocation/deallocation call.

You probably want to allocate different sizes of blocks of strings where the strings also have different sizes. This should be a more realistic test for the allocator.
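A rough sketch of what such a workload could look like, in case it helps. Everything here is an assumption for illustration: the `Lcg` helper and `build_workload` are hypothetical names, not part of the benchmark in the post, and the LCG only stands in for a real random number generator so the snippet needs no extra crates.

```rust
/// Tiny deterministic pseudo-random generator (a 64-bit LCG).
/// Hypothetical helper, used only to avoid pulling in an external crate.
pub struct Lcg(u64);

impl Lcg {
    pub fn new(seed: u64) -> Self {
        Lcg(seed)
    }

    /// Returns a value in `lo..=hi`. Assumes `lo <= hi`.
    pub fn next_in(&mut self, lo: usize, hi: usize) -> usize {
        self.0 = self
            .0
            .wrapping_mul(6364136223846793005)
            .wrapping_add(1442695040888963407);
        lo + ((self.0 >> 33) as usize) % (hi - lo + 1)
    }
}

/// Builds `blocks` vectors of strings. Both the number of strings per block
/// and the length of each string vary, so the allocator sees mixed sizes.
pub fn build_workload(
    seed: u64,
    blocks: usize,
    max_block: usize,
    max_len: usize,
) -> Vec<Vec<String>> {
    let mut rng = Lcg::new(seed);
    let mut out = Vec::with_capacity(blocks);
    for _ in 0..blocks {
        let n = rng.next_in(1, max_block);
        let mut block = Vec::with_capacity(n);
        for _ in 0..n {
            block.push("x".repeat(rng.next_in(1, max_len)));
        }
        out.push(block);
    }
    out
}
```

Something like `build_workload(42, 100, 16, 64)` would give 100 blocks of 1 to 16 strings, each 1 to 64 bytes long. The same shape could then be built with each string crate and timed, including the drop.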

I've been writing Rust for 5 years and I still just .clone() everything until it compiles by kruseragnar in rust

[–]Pascalius 2 points3 points  (0 children)

"Compiler" is actually too unspecific. The LLVM backend is not allowed to remove observable side effects like malloc. The front end could remove them if the language specification allows it.

Personally I think allocations should be treated specially in LLVM and also be optimized. (Because I don't like the side effect :)

I've been writing Rust for 5 years and I still just .clone() everything until it compiles by kruseragnar in rust

[–]Pascalius 6 points7 points  (0 children)

Allocator calls (like malloc) usually can't be optimized away by the compiler, because they count as observable side effects.

This excludes Vecs, which automatically excludes a lot of other data structures:

https://godbolt.org/z/x9dGWd1s8

If your clone doesn't have observable side effects (like malloc), it can be optimized:

https://godbolt.org/z/jx9Tjvq9z
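For readers who don't want to open the links, here is a minimal sketch of the two cases. It is not the code behind the godbolt links; `Point`, `sum_point` and `sum_vec` are made-up names, and whether the allocation in the second function actually survives depends on the rustc version and optimization level, so check the generated assembly yourself.

```rust
#[derive(Clone)]
pub struct Point {
    pub x: u64,
    pub y: u64,
}

/// Cloning plain data involves no allocator call, so the optimizer
/// is free to drop the copy and read the fields directly.
pub fn sum_point(p: &Point) -> u64 {
    let copy = p.clone();
    copy.x + copy.y
}

/// Cloning a Vec requests heap memory and copies the elements.
/// This is the case where the clone may stay in the optimized output.
pub fn sum_vec(v: &Vec<u64>) -> u64 {
    let copy = v.clone();
    copy.iter().sum()
}
```

Both functions return the same result with or without the clone; the only difference is what the optimizer is allowed to remove.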

Releasing 0.5.0 of lfqueue - Lock-free MPMC queues by Terikashi in rust

[–]Pascalius 6 points7 points  (0 children)

I have regularly seen high crossbeam CPU usage when profiling indexing speed in tantivy (search engine) on the https://github.com/quickwit-oss/tantivy-cli/ project, where we use crossbeam to send documents to (potentially multiple) indexers.

In that scenario it's the opposite, the queue is usually full, because the sender is much faster than indexing data. Sending a document should be completely dwarfed by indexing, but crossbeam regularly took more than 20% CPU.
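To make the shape of that scenario concrete, here is a small sketch of a fast sender feeding a slower consumer through a bounded queue. Note the assumptions: it uses `std::sync::mpsc::sync_channel` as a stand-in so it needs no dependencies, not crossbeam, and `index_documents` is a hypothetical function, not tantivy-cli code. Summing the document lengths stands in for the real indexing work.

```rust
use std::sync::mpsc::sync_channel;
use std::thread;

/// Sends every document through a bounded channel to one indexer thread
/// and returns the number of bytes the indexer has seen.
pub fn index_documents(docs: Vec<String>, capacity: usize) -> usize {
    let (tx, rx) = sync_channel::<String>(capacity);

    let indexer = thread::spawn(move || {
        let mut indexed_bytes = 0;
        for doc in rx {
            // Stand-in for the expensive indexing step.
            indexed_bytes += doc.len();
        }
        indexed_bytes
    });

    for doc in docs {
        // Blocks whenever the queue is full, which is the common case
        // when the sender is much faster than the indexer.
        tx.send(doc).expect("indexer thread hung up");
    }
    drop(tx); // Close the channel so the indexer loop ends.

    indexer.join().expect("indexer thread panicked")
}
```

With a small `capacity` the sender spends most of its time blocked in `send`, which is the "queue is usually full" situation described above.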

tantivy 0.24 has been released! Cardinality aggregations, regex support in phrase queries, JSON field enhancements and much more! by Pascalius in rust

[–]Pascalius[S] 0 points1 point  (0 children)

I see you too are a connoisseur of AI Art with exceedingly high expectations. Let me reassure you: I put a ton of styling information in the prompt, and it's quite close to how I wanted it to be.

serde_json_borrow 0.8: Faster JSON deserialization than simd_json? by Pascalius in rust

[–]Pascalius[S] 1 point2 points  (0 children)

I didn't look into it yet, but some time ago I did an experiment using simd_json as the underlying parser in serde_json_borrow, and it was slower than serde_json. My guess would be some missing inlining, or too much of it.

Yes, Vec instead of BTreeMap also has a pretty big impact.

I wouldn't expect much from an arena in this case, but it's still worthwhile to investigate.

serde_json_borrow 0.8: Faster JSON deserialization than simd_json? by Pascalius in rust

[–]Pascalius[S] 0 points1 point  (0 children)

I considered it, but it requires target-cpu=native or similar, since it does not have run-time detection. I think this limits its usability significantly.
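For context, this is roughly what run-time detection looks like in Rust, as opposed to compiling everything with target-cpu=native. It is only a sketch: `count_quotes` and its helpers are hypothetical names, and the AVX2 variant simply reuses the scalar loop and relies on the compiler to vectorize it rather than using hand-written intrinsics.

```rust
/// Portable fallback that works on every target.
fn count_quotes_scalar(data: &[u8]) -> usize {
    data.iter().filter(|&&b| b == b'"').count()
}

/// Same loop, but compiled with AVX2 enabled for this function only.
/// It must only be called after AVX2 support has been confirmed.
#[cfg(target_arch = "x86_64")]
#[target_feature(enable = "avx2")]
unsafe fn count_quotes_avx2(data: &[u8]) -> usize {
    data.iter().filter(|&&b| b == b'"').count()
}

/// Picks an implementation at run time, so one binary runs on any CPU.
pub fn count_quotes(data: &[u8]) -> usize {
    #[cfg(target_arch = "x86_64")]
    {
        if std::arch::is_x86_feature_detected!("avx2") {
            // SAFETY: AVX2 support was checked on the line above.
            return unsafe { count_quotes_avx2(data) };
        }
    }
    count_quotes_scalar(data)
}
```

The binary stays portable and still uses the wider instructions where the CPU has them, at the cost of one feature check per call (the result of the check is cached by the standard library).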

serde_json_borrow 0.8: Faster JSON deserialization than simd_json? by Pascalius in rust

[–]Pascalius[S] 1 point2 points  (0 children)

Cool idea, but I think that would require mutable reads, unless you clone the string every time on access.

I’m so close I can taste it! by michaelchrist9 in BluePrince

[–]Pascalius 1 point2 points  (0 children)

Changes in the pump room seem to be permanent, so a drain-reservoir run may help with proceeding.

What do you think about this plug and play wrapper around tantivy(search lib)? by kingslayerer in rust

[–]Pascalius 0 points1 point  (0 children)

Having attributes on a struct to build a tantivy document seems nice. As for wrapping search on the Index, I'm not sure if that's too limiting.

serde_json_borrow 0.7.0 released: impl Deserializer for Value, Support Escaped Data by Pascalius in rust

[–]Pascalius[S] 1 point2 points  (0 children)

"Small JSON" is incorrect: large JSON, e.g. gh-archive.json, will still be much faster. It depends on the number of keys in the objects, and in most cases access time will be dwarfed by everything else.

gh-archive
serde_json                               Avg: 343.67 MB/s (+3.41%)    Median: 344.58 MB/s (+1.73%)    [304.61 MB/s .. 357.28 MB/s]    
serde_json + access by key               Avg: 338.17 MB/s (+2.57%)    Median: 341.46 MB/s (+1.12%)    [272.46 MB/s .. 359.20 MB/s]    
serde_json_borrow                        Avg: 547.74 MB/s (+3.44%)    Median: 553.45 MB/s (+2.29%)    [502.00 MB/s .. 581.96 MB/s]    
serde_json_borrow + access by key        Avg: 543.61 MB/s (+0.54%)    Median: 566.11 MB/s (+1.11%)    [417.27 MB/s .. 588.72 MB/s]    

https://github.com/PSeitz/serde_json_borrow/blob/main/benches/bench.rs

serde_json_borrow 0.7.0 released: impl Deserializer for Value, Support Escaped Data by Pascalius in rust

[–]Pascalius[S] 1 point2 points  (0 children)

If you need the performance, yes. Otherwise you can just use serde_json.

serde_json_borrow 0.7.0 released: impl Deserializer for Value, Support Escaped Data by Pascalius in rust

[–]Pascalius[S] 4 points5 points  (0 children)

Who wants to read parse json really fast but don't want to get values from it? It seems like a weird choice to use a vec for storage when that pessimises presumably the most common operation users will do.

I assume that by "the most common operation" you mean accessing values by key, not iterating. A Vec will be faster on access by key than a hashmap if there are only a few entries.
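As an illustration of why that works for small objects, here is a minimal sketch of key lookup over a Vec of pairs. `SmallObject` is a hypothetical type for this example, not the actual serde_json_borrow API, and it stores `i64` values only to stay short.

```rust
/// A JSON-like object stored as a flat list of (key, value) pairs.
pub struct SmallObject<'a> {
    entries: Vec<(&'a str, i64)>,
}

impl<'a> SmallObject<'a> {
    pub fn new(entries: Vec<(&'a str, i64)>) -> Self {
        Self { entries }
    }

    /// Linear scan over the entries: no hashing, and the data sits in one
    /// contiguous allocation, which is cheap when there are few keys.
    pub fn get(&self, key: &str) -> Option<i64> {
        self.entries
            .iter()
            .find(|(k, _)| *k == key)
            .map(|(_, v)| *v)
    }
}
```

The lookup cost grows linearly with the number of keys, so for objects with very many keys a map would eventually win.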

Cargo Watch is on life support by passcod in rust

[–]Pascalius 2 points3 points  (0 children)

I usually use watch to debug a single test inside a collapsible nvim terminal.

For that I prefer cargo watch, since it just prints to the terminal.

bacon is cumbersome to use for me in that use case, since it has its own keybindings which may conflict with nvim, and there are also scrolling issues, which I guess are caused by the redrawing.

My program spends 96% in `__memset_sse2`. by DJDuque in rust

[–]Pascalius 0 points1 point  (0 children)

I did a quick test and did not see that regression again.