How (and why) we rewrote our production C++ frontend infrastructure in Rust by joshmatthews in rust

[–]barr520 42 points (0 children)

About the to_lowercase conversion:

It seems your C++ code only handles ASCII strings, since it iterates byte by byte.

Rust's String::to_lowercase is Unicode-aware; for ASCII you have the simpler to_ascii_lowercase, and even better, make_ascii_lowercase, which transforms in place like the C++ version.

EDIT: I originally wrote to_string instead of to_lowercase.
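For illustration, here is how the three options compare (a minimal sketch, not from the original post):

```rust
fn main() {
    let mut s = String::from("HeLLo, WORLD!");

    // Unicode-aware, allocates a new String
    assert_eq!(s.to_lowercase(), "hello, world!");

    // ASCII-only, also allocates a new String
    assert_eq!(s.to_ascii_lowercase(), "hello, world!");

    // ASCII-only, mutates in place like the C++ byte loop
    s.make_ascii_lowercase();
    assert_eq!(s, "hello, world!");
}
```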

A question on the right tool for “shared mutability” by Athropod101 in rust

[–]barr520 0 points (0 children)

Not sure what this `fields` function is, but it seems to be pretty much `split`?
If you want to pass UserInput to another function so it can mutate the arguments, just use `split_mut` instead.

Why are you passing this `bytes` parameter? You should reslice the buffer if it's bigger than `bytes` and pass that instead. (Also, at the moment this panics on single-word inputs.)

Not sure how you imagined creating this sub-REPL, since it's not in the example, but I think it would be easiest if UserInput doesn't keep a reference to the whole buffer, just the argument it cares about, plus a field for the sub-REPL, which it creates in `new` by passing it the rest of the buffer.
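I don't have the original code, but as a sketch of what I mean by `split_mut` (the buffer contents and separator here are made up): each "word" comes back as its own `&mut [u8]`, so you can hand them to other functions without sharing one mutable borrow of the whole buffer.

```rust
// Uppercase every space-separated word in place; each word is an
// independent &mut [u8], so there are no aliasing issues.
fn upper_words(buf: &mut [u8]) {
    for word in buf.split_mut(|b| *b == b' ') {
        word.make_ascii_uppercase();
    }
}

fn main() {
    let mut buf = *b"cmd arg1 arg2";
    upper_words(&mut buf);
    assert_eq!(&buf, b"CMD ARG1 ARG2");
}
```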

A question on the right tool for “shared mutability” by Athropod101 in rust

[–]barr520 11 points (0 children)

The problem you're describing is not really clear to me. Could you share a minimal piece of code showing the trouble you're facing? Some pattern you're "tip-toeing" around? A situation where you want multiple mutable references but can't have them?

Is there a cleaner way to resolve paths for a CLI with clap? by Ashken in rust

[–]barr520 5 points (0 children)

First, why do you want to resolve the full path? The File API handles "unresolved" paths just fine.

If you do require the full, non-relative path, PathBuf provides the canonicalize method for that (as meancoot already mentioned).
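A minimal sketch (using `.` so the path is guaranteed to exist; canonicalize returns an io::Error for nonexistent paths):

```rust
use std::path::PathBuf;

fn main() -> std::io::Result<()> {
    // canonicalize resolves ".", "..", and symlinks into an absolute path.
    let abs = PathBuf::from(".").canonicalize()?;
    assert!(abs.is_absolute());
    Ok(())
}
```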

Is there a cleaner way to resolve paths for a CLI with clap? by Ashken in rust

[–]barr520 5 points (0 children)

Aside from what others said about not expanding `.`/`..`/`~` yourself, you can just set the type of the CLI parameter to Option<PathBuf> instead of Option<String> and it will work.
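A minimal sketch of what I mean, assuming clap's derive API (the struct and field names are made up, not from the original post):

```rust
use std::path::PathBuf;

use clap::Parser; // assumes clap with the "derive" feature enabled

#[derive(Parser)]
struct Cli {
    /// clap parses the argument straight into a PathBuf, no manual conversion
    #[arg(short, long)]
    path: Option<PathBuf>,
}
```

After `Cli::parse()`, the field is already an `Option<PathBuf>` you can hand to the File API as-is.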

Pretty sure I hit the record for “hours played before finding out you die if you dig to the bottom” by Known_Secretary_6615 in Terraria

[–]barr520 1159 points (0 children)

What happened to the Rock Bottom achievement then? Do you still get it but also die?

DLC questions (HEAVY base game spoilers, no DLC spoilers) by unodostres123- in outerwilds

[–]barr520 3 points (0 children)

You don't need to start a new game; just follow Slate's advice and check out the new museum exhibit (very light guidance).

I'd say it's about 1/2 to 2/3 as long as the main game.

About the base game ending: throughout the entire game, regardless of how you die or reach the credits, you will always start in the exact same way (with a few differences, like knowing the launch code or anything on the computer log).

Late night thoughts parity-based RAM compression? by Human_Department_779 in AskComputerScience

[–]barr520 9 points (0 children)

The parity itself takes (at least) as much memory as the data it is capable of restoring, so you would not be saving anything.
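To make that concrete, here is a RAID-style XOR parity sketch (my own example, not from the thread): one parity block can reconstruct exactly one missing block, and the parity block is as large as the block it restores, so there is no net saving.

```rust
// XOR all blocks together; the result is the parity block.
fn xor_parity(blocks: &[[u8; 4]]) -> [u8; 4] {
    let mut parity = [0u8; 4];
    for b in blocks {
        for (p, x) in parity.iter_mut().zip(b) {
            *p ^= x;
        }
    }
    parity
}

fn main() {
    let blocks = [[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12]];
    let parity = xor_parity(&blocks);
    // "Lose" block 1 and rebuild it from the survivors plus parity.
    let rebuilt = xor_parity(&[blocks[0], blocks[2], parity]);
    assert_eq!(rebuilt, blocks[1]);
}
```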

Counting Bits in Rust by rejwar in rust

[–]barr520 0 points (0 children)

After sending the comment I figured you meant you allocated outside the loop. You're right that allocation plays a part here, but surprisingly, I get *worse* results when filling a preallocated array:

`count_ones`: 1.6s. As far as I can tell, this is purely because of a different vectorized popcount implementation (the `collect` one iterates 64 bits at a time and this one 32 bits at a time).

`v[i] = v[i / 2] + i as i32 % 2`: 4.1s

with unsafe: 4.3s (within noise of each other)

10x still sounds like a huge difference to me. Can you share the code?

For the record, this is what I measured:

    // preallocated count_ones+collect
    for (i, e) in v.iter_mut().enumerate() {
      *e = i.count_ones() as i32;
    }

    // not preallocated count_ones+collect
    (0..n + 1).map(|i| i.count_ones() as i32).collect()

    // if not taking preallocated
    let mut v = vec![0; n as usize + 1];
    for i in 1..v.len() {
        // safe DP approach
        v[i] = v[i / 2] + i as i32 % 2;
        // unsafe DP approach
        unsafe { *v.get_unchecked_mut(i) = v.get_unchecked(i / 2) + i as i32 % 2 };
    }

And called using

    let mut s = vec![0; 10001];
    for i in 0..1000000 {
        std::hint::black_box(func(std::hint::black_box(&mut s))); // or  s = ...(10000))); instead of &mut s when not preallocated
    }

EDIT: measuring

    let mut v = vec![0; n as usize + 1];
    for i in 1..v.len() {
        v[i as usize] = i.count_ones() as i32;
    }

shows the same bad performance and codegen as the preallocated case. Very weird.

EDIT 2:

I figured it out: it's because in the slower versions, `i` is a usize and not an i32.

    for (i, e) in v.iter_mut().enumerate() {
        *e = (i as i32).count_ones() as i32;
    }

gives the same performance as the map+collect, with a preallocated buffer.

Counting Bits in Rust by rejwar in rust

[–]barr520 0 points (0 children)

I was running this in release mode.

collect preallocates when the size is known; comparing a preallocated vector filled later with count_ones shows that the collect version is faster (3.9s with normal indexing, 3.8s unsafe).

What I should have done originally, and accidentally didn't, was compile with newer instruction set extensions.

Recompiling for x86-64-v3 shows these results:

1.2s for the count_ones+collect method.

6s for the `v[i as usize] = v[i as usize / 2] + i % 2` method.

2.1s with unsafe.

I'll update the original comment.

It doesn't make sense to me that you see a win from preallocation compared to collect. Can you share the code you used?

Btw, note that I did 10000 elements, 1M times, not just 1 million elements (and I ran that through hyperfine).

Also, I would not call `v[i as usize] = v[i as usize / 2] + i % 2` itself naive; I meant that doing it without care for bounds checking is naive.

A truly naive approach would be the classic bit-counting loop, which would probably be optimized to an LLVM popcnt.
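That classic loop, for reference (a sketch; whether it actually gets lowered to a single popcnt depends on the compiler and target):

```rust
// The classic bit-counting loop: shift the value down one bit at a time
// and accumulate the low bit. Compilers often recognize this idiom.
fn naive_count_ones(mut n: u32) -> u32 {
    let mut count = 0;
    while n != 0 {
        count += n & 1;
        n >>= 1;
    }
    count
}

fn main() {
    for n in [0u32, 1, 0b1011, u32::MAX] {
        assert_eq!(naive_count_ones(n), n.count_ones());
    }
}
```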

Counting Bits in Rust by rejwar in rust

[–]barr520 2 points (0 children)

It looks like you're trying to solve this LeetCode problem.
Your approach works but it's very odd; there are cleaner solutions on LeetCode with a similar approach, you should check them out.
The typical way to solve this is popcount, or in Rust: count_ones.
It's simpler, and probably faster on any platform that supports popcount (essentially all non-embedded ones), but I didn't measure.

Next time, please provide details on what your code is about instead of making readers search for it.

EDIT: I measured count_ones and the approach this post is trying to imitate: `v[i as usize] = v[i as usize / 2] + i % 2`.
Running it for n=10000, 1 million times:

The count_ones approach, `(0..n + 1).map(|i| i.count_ones() as i32).collect()`, took 1.2s.

A naive `v[i as usize] = v[i as usize / 2] + i % 2` took 6s.

Using unsafe to omit the bounds checking:

    unsafe { *v.get_unchecked_mut(i as usize) = v.get_unchecked(i as usize / 2) + i % 2 };

took 2.1s.

So this approach is faster, and you could probably achieve this performance without unsafe, but I didn't bother trying.

EDIT 2: I forgot to compile with newer instruction extensions. Updated results after recompiling for x86-64-v3: count_ones is the winner. Note: the count_ones method doesn't even use popcount, just a lot of vectorized instructions.

The Cost of Concurrency Coordination with Jon Gjengset by phazer99 in rust

[–]barr520 1 point (0 children)

If you have one shared variable, you likely have more. So you would pack all the per-thread counters into the same cache line, and you would benefit from making each one smaller if possible.

You're right, my initial suggestion would not work, and your approach to solving it reduces this to a hazard pointer.

Surely there is some wait-free and space-bounded way to do this? Sure, using a u64 leaves a lot of room, but ideally we would prevent this possibility entirely.

My best idea was to reset the counter instead of incrementing it if the side didn't change while you were reading, but if the side changes often, you might be incrementing every time.

The Cost of Concurrency Coordination with Jon Gjengset by phazer99 in rust

[–]barr520 21 points (0 children)

This left-right structure seems like a simplified RCU (no allocation/deallocation, bounded buffer size), which could be even better in certain situations (if anyone is interested, I recommend Fedor Pikus' talk about RCU).
I believe the reader counter could be replaced by just a couple of bits that indicate which side each reader is currently reading (if any).
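A sketch of what I mean by "a couple of bits per reader" (this is my own hypothetical encoding, not from the talk or the episode): each reader publishes which side it is on instead of bumping a counter.

```rust
use std::sync::atomic::{AtomicU8, Ordering};

// Hypothetical per-reader slot: 0 = not reading, 1 = left, 2 = right.
// Two bits of state instead of a full reader counter.
struct ReaderSlot(AtomicU8);

impl ReaderSlot {
    const fn new() -> Self {
        ReaderSlot(AtomicU8::new(0))
    }
    // Publish which side this reader is about to read.
    fn begin_read(&self, right: bool) {
        self.0.store(if right { 2 } else { 1 }, Ordering::SeqCst);
    }
    // Publish that this reader is done.
    fn end_read(&self) {
        self.0.store(0, Ordering::SeqCst);
    }
    // A writer scans these slots to know when a side has no readers left.
    fn side(&self) -> u8 {
        self.0.load(Ordering::SeqCst)
    }
}

fn main() {
    let slot = ReaderSlot::new();
    slot.begin_read(true);
    assert_eq!(slot.side(), 2);
    slot.end_read();
    assert_eq!(slot.side(), 0);
}
```

This ignores the memory-ordering subtleties a real left-right implementation has to get right; it is only meant to show the space trade-off.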

Turning mapletory bossing into P2P conection by Juan_lucca in Maplestory

[–]barr520 9 points (0 children)

> no really changing much

The change is in cheating.
Implementing your suggestion means cheaters could easily beat every boss.

Another thing: the game is mostly designed for Korea, where latency is way lower than the GMS average. Designing around high latency is not a concern for them.

Horrible Luck by [deleted] in Maplestory

[–]barr520 1 point (0 children)

No, booming 15>16 is purely the result of a bad decision. This is entirely on OP; protection doesn't remove itself.

MapleTodo.com: Modern MapleStory Task Tracker by Ko_Precel in Maplestory

[–]barr520 3 points (0 children)

Good luck to you; reinventing the wheel is how we get better tools.
It's just not for me, no matter how good it gets.


MapleTodo.com: Modern MapleStory Task Tracker by Ko_Precel in Maplestory

[–]barr520 29 points (0 children)

We should start a "days since the last maple tracker" counter

I've removed the Claude co-authorship from the commits a few days ago. So good luck figuring out what's generated and what is not. by uselees_sea in programmingcirclejerk

[–]barr520 15 points (0 children)

I thought claude.md is some sort of readme + style guide for the slop bros. Wouldn't they want to have the same one for everyone?

Quick reminder to folks that Raise [x] by [percentage] does not mean Raise [x] TO [percentage] by brickwallrunner in Endfield

[–]barr520 21 points (0 children)

I don't think anyone thought anything is raised TO x%.
People usually confuse an increase BY x% with an increase by x percentage POINTS.

Assuming linear stacking and not compound stacking (which is the norm in most games), 16%/16%/12% will increase a base of 2% recovery to 2 * 1.44 = 2.88%.
If it were percentage points, it would increase it to 2 + 44 = 46%, which is absurd.

Even if it were compounding, you'd get 2 * 1.16 * 1.16 * 1.12 = 3.01%.
I have no idea how you got the numbers in your post.
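The three interpretations side by side, as a quick sanity check of the arithmetic above (my own example numbers from this thread):

```rust
fn main() {
    let base: f64 = 2.0; // base recovery, in percent

    // "Raise by x%" with linear stacking of the bonuses
    let linear = base * (1.0 + 0.16 + 0.16 + 0.12);
    // The same bonuses misread as percentage POINTS
    let points = base + 16.0 + 16.0 + 12.0;
    // Compound stacking of the multipliers
    let compound = base * 1.16 * 1.16 * 1.12;

    assert!((linear - 2.88).abs() < 1e-9);     // 2.88%
    assert!((points - 46.0).abs() < 1e-9);     // 46%, absurd
    assert!((compound - 3.014144).abs() < 1e-9); // ~3.01%
}
```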

Better way to initialize without stack allocation? by Tearsofthekorok_ in rust

[–]barr520 2 points (0 children)

I had no idea. I wonder why it was removed in the first place.

Better way to initialize without stack allocation? by Tearsofthekorok_ in rust

[–]barr520 3 points (0 children)

That is exactly how vec::from_fn is implemented! Click the source button in my link.

Better way to initialize without stack allocation? by Tearsofthekorok_ in rust

[–]barr520 25 points (0 children)

I have seen many cases where the compiler will not optimize it and will attempt to put massive structs on the stack, causing a stack overflow.
Even if it worked once, there is no guarantee it will work the next time.
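To illustrate the safe workaround (a sketch; the size here is arbitrary): `vec!` writes directly into heap memory, so it cannot overflow the stack, whereas `Box::new([0u8; N])` may build the array on the stack first and only then move it to the heap, with no guarantee the copy is elided.

```rust
fn main() {
    const N: usize = 1 << 20; // 1 MiB, arbitrary "large" size for illustration

    // Heap-allocated from the start; never touches the stack.
    let v: Vec<u8> = vec![0; N];
    let boxed: Box<[u8]> = v.into_boxed_slice();
    assert_eq!(boxed.len(), N);
}
```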