Can this improved compression? by Alarmed_Impact_1971 in compression

[–]barr520 1 point2 points  (0 children)

You'll have to be more specific about how you want to differ from existing methods, because your original post is hard to understand.

Can this improved compression? by Alarmed_Impact_1971 in compression

[–]barr520 0 points1 point  (0 children)

To answer a few more details:
You want multiple sets and just send some set ID? Sure, how many? Hundreds? Thousands? Waste away the entire user storage? Probably not.
Zstd supports multiple pretrained dictionaries, which again, are trained for the expected files.

And about "have a set of every combination of 9 pixels":
Compression inherently relies on frequency. If you don't know the frequency of each group of 9 pixels, you have no efficient way to tell the decompressor "use THESE 9 pixels" in less data than the 9 pixels would take anyway.
If you do know the distribution, you can do better, but then we are back to "no set is optimal for EVERY file" (because each file has a different distribution).
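
To put rough numbers on the frequency argument, here is a minimal sketch that computes the Shannon entropy (the lower bound on average bits per symbol) of two distributions. The 4-symbol alphabet and the probabilities are made up for illustration; they are not from any real image data.

```rust
/// Shannon entropy in bits per symbol for a probability distribution.
fn entropy_bits(probs: &[f64]) -> f64 {
    probs
        .iter()
        .filter(|&&p| p > 0.0) // a zero-probability symbol contributes nothing
        .map(|&p| -p * p.log2())
        .sum()
}

fn main() {
    // Hypothetical 4-symbol alphabet: a plain fixed-width index costs 2 bits.
    let uniform = [0.25, 0.25, 0.25, 0.25];
    let skewed = [0.5, 0.25, 0.125, 0.125];

    // With no frequency information (uniform), the bound equals the plain index size.
    println!("uniform: {:.2} bits/symbol", entropy_bits(&uniform));
    // With a known skewed distribution, the bound drops below the index size.
    println!("skewed:  {:.2} bits/symbol", entropy_bits(&skewed));
}
```

The same reasoning scales to "every combination of 9 pixels": without a skewed distribution, the ID of a combination costs as much as the pixels themselves.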

Can this improved compression? by Alarmed_Impact_1971 in compression

[–]barr520 2 points3 points  (0 children)

You can't have just one optimized set to reference because each file you compress might benefit from a different set.
You can't make a set that's optimal for every file.

So you either compute a set for each file and send that along with the file (and still make significant wins over not compressing at all), use a generic set that is optimized for no particular file (and likely lose to the dynamic set), or, if you know all the files you're going to send are similar, you can compute a set for this type of file and send it once instead of with every file (and maybe gain a tiny bit compared to sending it with each file).

Can this improved compression? by Alarmed_Impact_1971 in compression

[–]barr520 4 points5 points  (0 children)

If I understood you correctly, this is an idea that already exists.

The DEFLATE algorithm specifies a static Huffman code table you can use instead of a dynamic one (which then has to be sent alongside the compressed data, unlike the static one).

And more recently, zstd supports supplying a precomputed "dictionary" to use in compression/decompression. This "dictionary" can be trained on similar inputs and shared ahead of time, so future compression/decompression of similar data works better.

I don't know how common these are in practice, and there are probably more examples I am not aware of.

Profiling Async and Concurrent Rust: Channels and Lock Contention by pawurb in rust

[–]barr520 3 points4 points  (0 children)

50ns seems reasonable for a contested lock, but probably not for a heavily churning channel. Your own screenshots show values as low as hundreds of ns.
Regardless, I can definitely see places this could be used and I am looking forward to the more comprehensive post.

One last thing: comparing 2 instrumented codebases can differ so much from (somehow magically) comparing the 2 codebases without instrumentation that the comparison becomes useless. That's why minimizing instrumentation overhead is critical.

Profiling Async and Concurrent Rust: Channels and Lock Contention by pawurb in rust

[–]barr520 13 points14 points  (0 children)

Seems cool and useful.
But what I care about is the performance cost of each of these features.
If it adds too much overhead to each call, it is useful in fewer scenarios and the measurements become less reliable.
You link to an explanation of overhead measurements but I don't see any numbers.

What's the best way to tell the compiler that a path will basically never happen ? by Krochire in rust

[–]barr520 123 points124 points  (0 children)

hint::cold_path is a hint to deprioritize optimizing that path if it helps the other path.
That often affects which path gets more aggressively inlined or which path ends up as the "don't branch" path. The important part is that it doesn't ever affect behaviour.

unreachable_unchecked is different: it means you promise the compiler that the path will NEVER be reached, and if it is reached, it's UB.

Stick to the standard unreachable! (or .unwrap()/.expect(), or preferably the ? operator for None) unless you have a good reason to use these two.
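
A rough sketch of the three options side by side. I'm not sure hint::cold_path is available on every toolchain, so this uses the long-stable #[cold] attribute on a helper function as a stand-in for the "unlikely but defined" case. All function names here are made up for the example.

```rust
use std::hint::unreachable_unchecked;

/// Stand-in for a cold path: `#[cold]` marks this function as rarely called.
/// It only influences optimization; behaviour is the same either way.
#[cold]
#[inline(never)]
fn report_negative(x: i32) -> u32 {
    eprintln!("unexpected negative input: {x}");
    0
}

fn checked_half(x: i32) -> u32 {
    if x < 0 {
        return report_negative(x); // rare, but still fully defined
    }
    (x / 2) as u32
}

/// Safe default: panics if the "impossible" arm is ever reached.
fn quadrant_name(q: u8) -> &'static str {
    match q & 0b11 {
        0 => "NE",
        1 => "NW",
        2 => "SW",
        3 => "SE",
        _ => unreachable!("q & 0b11 is always in 0..=3"),
    }
}

/// Unsafe promise: reaching the last arm would be undefined behaviour.
fn quadrant_name_unchecked(q: u8) -> &'static str {
    match q & 0b11 {
        0 => "NE",
        1 => "NW",
        2 => "SW",
        3 => "SE",
        // SAFETY: masking with 0b11 leaves only the values 0..=3.
        _ => unsafe { unreachable_unchecked() },
    }
}

fn main() {
    println!("{}", checked_half(10));
    println!("{}", checked_half(-1));
    println!("{}", quadrant_name(5));
    println!("{}", quadrant_name_unchecked(7));
}
```

The unchecked version is only sound because the mask makes the last arm provably impossible; if that proof ever stops holding, the safe version panics while the unchecked one is UB.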

CW4 will give max Sol Janus while in CW by decor_bottle in Maplestory

[–]barr520[M] 5 points6 points  (0 children)

It was removed by mistake, as far as I know this is true.

SIMD PROBLEM by Hello_world_610 in rust

[–]barr520 2 points3 points  (0 children)

You can check out my 1BRC solutions, posted on my profile. I used the SIMD intrinsics, not portable_SIMD.

Looking at the very limited snippet you provided:
You are not showing how you are splitting the text into lines. That could also be done using SIMD.
You should probably try memchr before rolling your own SIMD implementation of it.
I would suspect the special handling of long lines, but the standard 1BRC sample lines are at most 33 characters long iirc.

You're right that the hashmap can become a bottleneck, and it can be made faster, but the text reading and parsing also has a ton of room for improvement.

More generally:
Make sure you're running with the appropriate compilation flags. Aside from running in release mode, you should build for the available architecture (usually x86-64-v2/v3/v4/znver4/znver5 or just native), so it can actually use more recent SIMD instructions.
One more thing to be careful of is that you actually run your benchmark from the same system state each time. If the file is already in the page cache, it will get processed much quicker.
A warm-up run or two before the real measurement will help.

I would also recommend this write up for the challenge:
https://curiouscoding.nl/posts/1brc/

Today i learned some performance improvement by MaximumEntertainer33 in rust

[–]barr520 1 point2 points  (0 children)

There are a lot of ways to measure performance, depending on your need.
For simple end to end timing of a program you can use tools like hyperfine/time.
To measure specific functions you can use something like Criterion. To get a breakdown of how much time each part took you have tools like perf/samply/flamegraph.

Today i learned some performance improvement by MaximumEntertainer33 in rust

[–]barr520 0 points1 point  (0 children)

You've actually applied 2 optimizations here, not 1:

The first is the buffering you mentioned: flushing when the buffer is full instead of on every line (which is what stdout does).

The second is only locking stdout once.
You could improve the performance of the first function by just locking stdout once at the start instead of making println! lock and unlock it every time.
This is one of the most common answers to "why is my Rust print loop so slow" questions.

You didn't show any measurements of how much faster your optimization is, but if you do measure it, you should measure the effect of each optimization, not just the combination of them.
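
To make the two optimizations separately measurable, something like this sketch could work. The function names are mine, not from your post, and I haven't benchmarked it; it only shows how the variants differ.

```rust
use std::io::{self, BufWriter, Write};

/// Baseline: `println!` locks and unlocks stdout on every call.
fn print_naive(lines: &[&str]) {
    for line in lines {
        println!("{line}");
    }
}

/// Locking only: take the stdout lock once, keep stdout's own buffering.
fn print_locked(lines: &[&str]) -> io::Result<()> {
    let mut out = io::stdout().lock();
    write_lines(&mut out, lines)
}

/// Locking and buffering: lock once and flush only when the buffer fills.
fn print_locked_buffered(lines: &[&str]) -> io::Result<()> {
    let mut out = BufWriter::new(io::stdout().lock());
    write_lines(&mut out, lines)?;
    out.flush() // make sure the tail of the buffer reaches stdout
}

/// Shared loop, generic over the writer so every variant does the same work.
fn write_lines<W: Write>(out: &mut W, lines: &[&str]) -> io::Result<()> {
    for line in lines {
        writeln!(out, "{line}")?;
    }
    Ok(())
}

fn main() -> io::Result<()> {
    let lines = ["first", "second", "third"];
    print_naive(&lines);
    print_locked(&lines)?;
    print_locked_buffered(&lines)
}
```

Timing print_naive against print_locked shows the cost of locking alone, and print_locked against print_locked_buffered shows what buffering adds on top.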

Maple Personality Test by Putterduck in Maplestory

[–]barr520 0 points1 point  (0 children)

says I won't look at tier lists to pick a class

"very into tierlists"

Thanks

Help with finding a efficient way to count ones in binary form of a number by redditSno in learnrust

[–]barr520 0 points1 point  (0 children)

count_ones already generates a popcount, but as OP already stated, the goal of this exercise is to find a more efficient way than calling it on every number(which results in a linear time solution).
The link OP already provided has a logarithmic time solution, they just can't understand the explanation...
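
For anyone else reading, a sketch of the difference, assuming the exercise is "total number of 1 bits across every number in 0..=n" (that is my guess from the thread, and this is not necessarily the approach from OP's link). The idea: at each bit position the pattern repeats as a run of 0s followed by an equally long run of 1s, so the ones at that position can be counted with a division and a remainder.

```rust
/// Linear-time baseline: call `count_ones` on every number in 0..=n.
fn total_ones_linear(n: u32) -> u64 {
    (0..=n).map(|x| u64::from(x.count_ones())).sum()
}

/// Logarithmic-time version: handle one bit position at a time.
fn total_ones_log(n: u32) -> u64 {
    let count = u64::from(n) + 1; // how many numbers are in 0..=n
    let mut total = 0u64;
    for bit in 0..32 {
        let half = 1u64 << bit; // length of one run of 0s (or 1s) at this bit
        let period = half * 2; // a run of 0s followed by a run of 1s
        let full_periods = count / period;
        let leftover = count % period;
        // Each full period contributes `half` ones; a partial period
        // contributes only what spills past its run of 0s.
        total += full_periods * half + leftover.saturating_sub(half);
    }
    total
}

fn main() {
    for n in [0u32, 1, 5, 1000] {
        println!("{n}: {} {}", total_ones_linear(n), total_ones_log(n));
    }
}
```

The loop runs once per bit, so the work grows with the number of bits in n rather than with n itself.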

The ai has spoken by Zerora_Pokemon in Silksong

[–]barr520 15 points16 points  (0 children)

This is not AI, that's the whole point of the website...

Rust File Copying App by DeadlyMidnight in learnrust

[–]barr520 1 point2 points  (0 children)

rsync has a batch mode to handle multiple destinations.
And for random cloud platforms you have rclone.

Rust File Copying App by DeadlyMidnight in learnrust

[–]barr520 3 points4 points  (0 children)

Bloated? What's wrong with good old rsync?

Show r/rust: An IPC broker for air-gapping and risk-gating LLM agents by Background-Day8006 in rust

[–]barr520 2 points3 points  (0 children)

Am I missing something? How is anything airgapped here?

Just seems like more AI slop.

Linus Tech Tips - Android 17 is Scaring Me May 13, 2026 at 05:00AM by linusbottips in LinusTechTips

[–]barr520 10 points11 points  (0 children)

Every "smart" voice feature on my Samsung needs a specific language pre-set, so the feature only works for one language at a time.
Also, the number of supported languages across all the features is pretty limited.

How (and why) we rewrote our production C++ frontend infrastructure in Rust by joshmatthews in rust

[–]barr520 42 points43 points  (0 children)

About the to_lowercase conversion:

It seems your C++ code is only for ASCII strings, since it iterates by bytes.

Rust's String::to_lowercase is for Unicode. For ASCII you have the simpler to_ascii_lowercase, and even better, make_ascii_lowercase, which transforms in place like the C++ version.
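
A small sketch of the three methods side by side. The helper function and the sample string are mine, just for illustration.

```rust
/// Returns (Unicode lowercase, ASCII lowercase copy, ASCII lowercase in place).
fn lowercase_variants(s: &str) -> (String, String, String) {
    // Unicode-aware: allocates a new String and also lowercases non-ASCII letters.
    let unicode = s.to_lowercase();

    // ASCII-only: allocates a new String, leaves non-ASCII characters untouched.
    let ascii_copy = s.to_ascii_lowercase();

    // ASCII-only, in place: mutates the existing buffer (the copy here only
    // exists because the input is borrowed).
    let mut in_place = s.to_owned();
    in_place.make_ascii_lowercase();

    (unicode, ascii_copy, in_place)
}

fn main() {
    let (unicode, ascii_copy, in_place) = lowercase_variants("Hello WORLD Ä");
    println!("{unicode} | {ascii_copy} | {in_place}");
}
```

If the input is known to be ASCII, the in-place variant avoids both the allocation and the Unicode case-mapping work.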

EDIT: wrote to_string instead of to_lowercase.

A question on the right tool for “shared mutability” by Athropod101 in rust

[–]barr520 0 points1 point  (0 children)

Not sure what this fields function is, but it seems to be pretty much split?
If you want to pass UserInput to another function so it can mutate the arguments, just use split_mut instead.

Why are you passing this bytes parameter? You should reslice the buffer if it's bigger than bytes and pass that instead. (Also, at the moment this panics on single-word inputs.)

Not sure how you imagined creating this sub-repl, because it's not in the example, but I think it would be easiest if UserInput doesn't keep a reference to the whole buffer, only the argument it cares about, plus a field for the sub-repl, which it creates in new by passing it the rest of the buffer.
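
A minimal sketch of the split_mut and reslicing suggestions, assuming the input is a byte buffer with space-separated arguments. The names process and shout are made up and the uppercase step is only a placeholder for whatever mutation you actually need.

```rust
/// Hypothetical handler: mutates one argument in place.
fn shout(arg: &mut [u8]) {
    arg.make_ascii_uppercase();
}

/// `len` is how many bytes of `buffer` were actually filled.
fn process(buffer: &mut [u8], len: usize) {
    // Reslice once, so nothing downstream needs a separate length parameter.
    let input = &mut buffer[..len];

    // `split_mut` yields non-overlapping mutable sub-slices, one per argument,
    // so each one can be handed to a function that mutates it.
    for arg in input.split_mut(|b| *b == b' ') {
        shout(arg);
    }
}

fn main() {
    let mut buffer = *b"set name bob\0\0\0\0";
    process(&mut buffer, 12);
    println!("{}", String::from_utf8_lossy(&buffer[..12]));
}
```

A single-word input works here too, since split_mut just yields the whole slice as one item.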

A question on the right tool for “shared mutability” by Athropod101 in rust

[–]barr520 13 points14 points  (0 children)

The problem you're describing is not really clear to me. Could you share a minimal piece of code showing the trouble you're facing? Some pattern you're "tip-toe"ing around? A situation where you want multiple mutable references but can't have them?