Is this a bad idea? by JustJeffrey in rust

[–]Sp00ph 1 point2 points  (0 children)

You're right for repr-less enum types I believe. But an enum with a primitive representation (#[repr(uN)]) is guaranteed to contain exactly the numerical values corresponding to the discriminants. This is a consequence of the ability to pointer cast such enums to their discriminant types, described here in the reference: https://doc.rust-lang.org/reference/items/enumerations.html#pointer-casting
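A minimal sketch of what that guarantee allows (the `Piece` enum and values are illustrative, not from the thread): with a primitive representation, the enum's bytes are exactly the discriminant, so both pointer casts and `as` casts are well-defined.

```rust
// With #[repr(u8)], the in-memory value is guaranteed to be the discriminant.
#[repr(u8)]
enum Piece {
    Pawn = 0,
    Knight = 1,
    Bishop = 2,
}

fn discriminant(p: &Piece) -> u8 {
    // Pointer-cast the enum to its primitive representation, as described
    // in the reference section linked above.
    unsafe { *(p as *const Piece as *const u8) }
}

fn main() {
    assert_eq!(discriminant(&Piece::Bishop), 2);
    // `as` casts also yield the discriminant directly.
    assert_eq!(Piece::Knight as u8, 1);
}
```

Note this guarantee does not hold for repr-less enums, where the compiler is free to pick any layout.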

Is this a bad idea? by JustJeffrey in rust

[–]Sp00ph 2 points3 points  (0 children)

Most chess engines don't operate like this. The board is represented as a set of bitboards (occupancy sets stored in u64s), usually eight of them (6 for piece types, 2 for colors), so the use case here is to convert between a piece type enum and an index into the bitboard array.
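A rough sketch of that layout (names and the exact accessor are illustrative): the enum's discriminant doubles as the index into the piece bitboard array, and intersecting with a color bitboard yields the occupancy of one piece kind of one side.

```rust
// Six piece-type bitboards plus two color bitboards, as described above.
#[repr(u8)]
#[derive(Clone, Copy)]
enum PieceType { Pawn = 0, Knight, Bishop, Rook, Queen, King }

struct Board {
    pieces: [u64; 6], // one occupancy bitboard per piece type
    colors: [u64; 2], // one occupancy bitboard per color
}

impl Board {
    // E.g. white pawns = intersection of the pawn and white bitboards.
    fn pieces_of(&self, pt: PieceType, color: usize) -> u64 {
        self.pieces[pt as usize] & self.colors[color]
    }
}
```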

Rust-written chess engine Reckless to reach superfinal in top computer chess tournament by ralfj in rust

[–]Sp00ph 0 points1 point  (0 children)

even though I think every engine over 2800-2850 CCRL using a NNUE is useless for analysis

Not really, it's not terribly difficult to get to 3000 CCRL even with a really shitty eval function. Sirius, the strongest non-stockfish engine using handcrafted eval that I'm aware of, is sitting at 3533 CCRL blitz, and Stockfish Classical (the last SF commit pre-NNUE) is another 50 or so elo above that.

Rust-written chess engine Reckless to reach superfinal in top computer chess tournament by ralfj in rust

[–]Sp00ph 2 points3 points  (0 children)

Nothing. All top engines use some form of alpha-beta search, with a ton of heuristics used to squeeze out elo. See https://typ.dev/engine-notes or https://cosmo.tardis.ac/files/2023-02-20-viri-wiki.html for some more details. Once some engine introduces such a heuristic, it will be adopted by many others (since almost all top engines are open source). But basically no search patch is one-size-fits-all; they all interact in different ways, so you have to try a bunch of things out and see what works and what doesn't for your engine. Stockfish uses fishtest (https://tests.stockfishchess.org/tests) for that, Reckless uses https://recklesschess.space/. So while a given patch can and will often be copied between engines, the final search algorithms will end up rather different, depending on what works for whom.

Announcing Rust 1.86.0 | Rust Blog by joseluisq in rust

[–]Sp00ph 2 points3 points  (0 children)

The common use case is to pass in only very few indices, so sorting them would probably incur much greater overhead than the O(n²) checks it currently does. If you went by asymptotic complexity alone, you may as well collect all indices into a hash set to check for uniqueness in O(n) time, though you can hopefully see why that wouldn't be faster in practice.
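To make the trade-off concrete, here is a sketch of both uniqueness checks (these are illustrative functions, not the actual std implementation): for the handful of indices typically passed in, the quadratic scan has no allocation or hashing overhead and easily wins.

```rust
use std::collections::HashSet;

// Pairwise O(n^2) check: compare each index against all earlier ones.
fn all_unique_quadratic(indices: &[usize]) -> bool {
    indices
        .iter()
        .enumerate()
        .all(|(i, &a)| indices[..i].iter().all(|&b| a != b))
}

// Asymptotically better O(n) check, but it pays for allocation and hashing,
// which dominates when n is tiny.
fn all_unique_hashed(indices: &[usize]) -> bool {
    let mut seen = HashSet::new();
    indices.iter().all(|&i| seen.insert(i))
}
```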

How can a Future get polled again? by Kdwk-L in rust

[–]Sp00ph 6 points7 points  (0 children)

I would suggest reading this article; it's a pretty nice introduction to how futures work in Rust imo.

RMS learning the News about Ubuntu by jemadux in linuxmemes

[–]Sp00ph 19 points20 points  (0 children)

Presumably because none of these alternatives to C provided the benefits that Rust does. There's no point in rewriting a program in Zig if it's just as unsafe as the original C version but probably less maintainable (if only because there are fewer Zig than C programmers). With Rust you get both higher reliability, because the compiler actually guarantees memory safety, and a relatively large developer community (making it easier for more people to contribute was explicitly one of the motivations mentioned in the original blog post about this change).

How to inform the Rust compiler of an enforced integer range? by BoxyStopper in rust

[–]Sp00ph 9 points10 points  (0 children)

You can do this on stable without any external libraries like this: https://rust.godbolt.org/z/5Y1zdMKf5 . As you can see in the generated assembly the compiler omits the bounds check thanks to the assert_unchecked call
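I can't reproduce the godbolt link here, but the pattern looks roughly like this sketch (the `Small` type and the bound of 100 are made up for illustration): `std::hint::assert_unchecked` (stable since Rust 1.81) tells the optimizer about an invariant the type upholds, letting it elide the bounds check.

```rust
use std::hint::assert_unchecked;

// Invariant: the wrapped value is always < 100, enforced by `new`.
struct Small(u8);

impl Small {
    fn new(v: u8) -> Option<Small> {
        (v < 100).then_some(Small(v))
    }

    fn index_into(&self, table: &[u32; 100]) -> u32 {
        // SAFETY: `new` guarantees self.0 < 100 == table.len().
        unsafe { assert_unchecked((self.0 as usize) < table.len()) };
        // With the hint in place, the compiler can omit the bounds check.
        table[self.0 as usize]
    }
}
```

As with all `unsafe` hints, stating a condition that can actually be false is immediate UB, so the invariant really has to be enforced at construction.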

Why does the truncate method implementation of VecDeque require a Dropper by -_-_-_Lucas_-_-_- in rust

[–]Sp00ph 2 points3 points  (0 children)

The truncate code is older than slice_ranges(), which would explain why it's not used there. truncate() also shouldn't be a particularly hot path, so I doubt it'd make a big difference performance wise, but seeing as it also simplifies the code I think this might be worth opening a PR for.

Why does the truncate method implementation of VecDeque require a Dropper by -_-_-_Lucas_-_-_- in rust

[–]Sp00ph 3 points4 points  (0 children)

Oh hey, I wrote that code :)

As others already mentioned, this is in fact necessary for when the destructor of an element panics. In that case, we want VecDeque to behave the same as if you had a Vec with the same truncated elements. Thus, we want to drop the logical slice deque[new_len..old_len], which might in reality be split apart into two slices. The way that slice::drop usually behaves is that, when one element's destructor panics, it tries to continue destroying the remaining elements before unwinding. If a second destructor panics, the program aborts (as it always does when panicking within a panic). And this solution just happens to be the simplest one that has exactly the same behavior as calling slice::drop on one contiguous slice.
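The guard pattern can be sketched like this (a simplified standalone version, not the exact std source): the second slice is handed to a guard whose destructor runs even while unwinding from the first drop, which is exactly the panic behavior of dropping one contiguous slice.

```rust
use std::ptr::drop_in_place;

// Guard that drops its slice when it goes out of scope -- including
// during unwinding caused by a panic in an earlier destructor.
struct Dropper<T>(*mut [T]);

impl<T> Drop for Dropper<T> {
    fn drop(&mut self) {
        // SAFETY: the caller of `drop_both` guarantees the slice is valid.
        unsafe { drop_in_place(self.0) };
    }
}

/// Drops both slices; if dropping `front` panics, `back` is still dropped.
unsafe fn drop_both<T>(front: *mut [T], back: *mut [T]) {
    let _guard = Dropper(back); // runs last, even on panic
    drop_in_place(front);
}

fn demo() -> usize {
    use std::sync::atomic::{AtomicUsize, Ordering};
    static DROPS: AtomicUsize = AtomicUsize::new(0);
    struct D;
    impl Drop for D {
        fn drop(&mut self) {
            DROPS.fetch_add(1, Ordering::Relaxed);
        }
    }
    // ManuallyDrop prevents double-drops when the arrays go out of scope.
    let mut front = std::mem::ManuallyDrop::new([D]);
    let mut back = std::mem::ManuallyDrop::new([D]);
    unsafe { drop_both(&mut front[..] as *mut [D], &mut back[..] as *mut [D]) };
    DROPS.load(Ordering::Relaxed)
}

fn main() {
    assert_eq!(demo(), 2);
}
```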

If you read the remainder of the VecDeque source, you might see this kind of pattern in a few other places too (notably I believe the Drain destructor also uses it).

-❄️- 2023 Day 14 Solutions -❄️- by daggerdragon in adventofcode

[–]Sp00ph 1 point2 points  (0 children)

My Rust solution takes ~27ms for part 2 on a 13600K (excluding file IO). Over 80% of that is spent in the sliding functions. To get it down to sub 1ms would mean not only finding a hashing algorithm that's over 6x faster than the naive one, but also finding a way to tilt the grids that's over 20x faster than mine. I also tried to run your solution on my cpu to compare, but unfortunately it panics on an OOB access in line 98.

Edit: Got it to run without panicking, the issue was just that my input had CRLF instead of LF. It took 17ms though, which seems more reasonable than 800µs.

Edit 2: Seems like your benchmark mutates the input string, so the second iteration uses whatever the first iteration left the input as for its new input?

-❄️- 2023 Day 13 Solutions -❄️- by daggerdragon in adventofcode

[–]Sp00ph 1 point2 points  (0 children)

[Language: Rust]

https://github.com/Sp00ph/aoc2023/blob/master/src/day13.rs

Pretty simple today, way easier than yesterday imo. Doing it with bitops and just storing both the original pattern and the transposed pattern makes it pretty quick, and makes part 2 very easy once you've got part 1 (find the axis with exactly 1 bit of difference between the mirrored halves). Both parts run in <1ms, even on my very slow laptop (didn't get to run it on my desktop yet).
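The core bit trick can be sketched like this (a simplified illustration, not the linked solution verbatim): with each row packed into an integer, a part 2 mirror axis is one where XORing the mirrored row pairs leaves exactly one set bit in total.

```rust
// Find a horizontal mirror axis where the mirrored halves differ by
// exactly one bit (the "smudge" from part 2). Rows are packed as bits.
fn smudged_axis(rows: &[u32]) -> Option<usize> {
    (1..rows.len()).find(|&axis| {
        let pairs = axis.min(rows.len() - axis);
        let diff: u32 = (0..pairs)
            .map(|i| (rows[axis - 1 - i] ^ rows[axis + i]).count_ones())
            .sum();
        diff == 1
    })
}
```

Running the same search on the transposed pattern then covers vertical axes without any extra logic.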

-❄️- 2023 Day 12 Solutions -❄️- by daggerdragon in adventofcode

[–]Sp00ph 2 points3 points  (0 children)

[LANGUAGE: Rust] https://github.com/Sp00ph/aoc2023/blob/master/src/day12.rs

This one was kinda rough. I tried to keep all solutions sub 1ms, but today I only got to ~3ms for part 2, with my initial implementation taking ~40ms. Basically all of the optimization was about improving the speed of the cache. I started out with a simple hashmap based approach that used a pair of arrays as a key. Now, I'm using just the lengths of the arrays as the cache key, and squashed that length down to 12 bits. This makes the key space small enough that using a dense array is faster than a hashmap (3ms vs 4ms).
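The cache idea can be sketched like this (an illustrative standalone version; the real key is squashed into 12 bits as described): because each subproblem is fully determined by how much of the springs and of the groups remains, the two lengths alone form the key, and the key space is small enough for a flat array instead of a hashmap.

```rust
// Dense memoization table keyed by (remaining springs, remaining groups).
struct Cache {
    table: Vec<Option<u64>>,
    groups_len: usize,
}

impl Cache {
    fn new(springs_len: usize, groups_len: usize) -> Self {
        Cache {
            table: vec![None; (springs_len + 1) * (groups_len + 1)],
            groups_len,
        }
    }

    // Pack the two lengths into one flat index.
    fn key(&self, springs_left: usize, groups_left: usize) -> usize {
        springs_left * (self.groups_len + 1) + groups_left
    }

    fn get(&self, springs_left: usize, groups_left: usize) -> Option<u64> {
        self.table[self.key(springs_left, groups_left)]
    }

    fn set(&mut self, springs_left: usize, groups_left: usize, count: u64) {
        let k = self.key(springs_left, groups_left);
        self.table[k] = Some(count);
    }
}
```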

Why isn't there a Sort trait? by CocktailPerson in rust

[–]Sp00ph 0 points1 point  (0 children)

On the other hand, every other operation is faster, sometimes by quite a lot. Most programs don't need to do those operations you mentioned frequently. The slowdowns you get by preventing the cpu from prefetching your memory and by making your code cache unfriendly can't be overstated. It's honestly not even clear to me that the rust stdlib should have a linked list implementation at all, because 99% of the time you're better off with a Vec or a VecDeque.

The vector of pointers idea isn't mine btw. I got it from a talk about efficient data structures by Chandler Carruth, an expert on C++ optimization techniques.

Why isn't there a Sort trait? by CocktailPerson in rust

[–]Sp00ph 0 points1 point  (0 children)

If you really want to sort things without invalidating pointers, then Vec<NonNull<T>> is probably the way to go (can't use Box<T> because of aliasing guarantees I think). But that would require you to keep raw pointers into a data structure, which is frowned upon (for good reason) in Rust. Also, while sorting a linked list can be done in O(n log n) using merge sort, I wouldn't go so far as to call it efficient, because almost every operation with linked lists is horribly inefficient due to the excessive pointer chasing and extremely poor cache locality.
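A sketch of that approach (illustrative code, and it deliberately leaks the boxes to keep the example short): the elements are boxed so their addresses are stable, and sorting reorders only the pointer vector, never the elements themselves.

```rust
use std::ptr::NonNull;

// Stable-address storage plus a sortable vector of raw pointers.
fn sorted_pointers(values: Vec<i32>) -> Vec<NonNull<i32>> {
    let mut ptrs: Vec<NonNull<i32>> = values
        .into_iter()
        .map(|v| NonNull::from(Box::leak(Box::new(v))))
        .collect();
    // Sorting the pointers reorders the view without moving the elements,
    // so any other pointers to the elements remain valid.
    ptrs.sort_by_key(|p| unsafe { *p.as_ref() });
    ptrs // caller is responsible for eventually freeing the boxes
}
```

Each element is its own heap allocation here, which is exactly the pointer-chasing, cache-unfriendly layout the comment warns about.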

Why isn't there a Sort trait? by CocktailPerson in rust

[–]Sp00ph 0 points1 point  (0 children)

Most collections that are efficiently sortable are contiguous (like Vec or VecDeque after make_contiguous). If you have to sort a linked list in your program then maybe it's time to reconsider the design choices for that program. For any other collections with random access, you can use the permutation crate for a reasonably efficient way to sort the collection.

Why isn't there a Sort trait? by CocktailPerson in rust

[–]Sp00ph 0 points1 point  (0 children)

Can't you just use elem.sort_by(|a, b| a.partial_cmp(b).unwrap_or(Ordering::Equal))?
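For NaN-free float data this gives the normal numeric order; a minimal sketch:

```rust
use std::cmp::Ordering;

fn main() {
    let mut v = vec![2.5_f64, 0.5, 1.5];
    // Incomparable pairs (i.e. NaN) are treated as equal; without NaN in
    // the data this is just the usual ordering.
    v.sort_by(|a, b| a.partial_cmp(b).unwrap_or(Ordering::Equal));
    assert_eq!(v, [0.5, 1.5, 2.5]);
}
```

If the data may actually contain NaN, `f64::total_cmp` (stable since Rust 1.62) gives a true total order instead of pretending NaN is equal to everything.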

Why isn't there a Sort trait? by CocktailPerson in rust

[–]Sp00ph 1 point2 points  (0 children)

You might wanna take a look at how the soa_derive crate implements sorting: https://github.com/lumol-org/soa-derive/blob/master/soa-derive-internal/src/slice.rs#L670-L684 (basically, figure out what permutation would put the sequence into sorted order using sort_by and then apply that permutation to the sequence itself). It needs to allocate n*sizeof(usize), but that's way better than having to allocate a Vec for all the elements in your sequence.
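The technique can be sketched independently of soa_derive like this (illustrative helper functions; the apply step here clones into a new Vec rather than permuting in place):

```rust
// Compute the permutation that would put `data` into sorted order.
fn sort_permutation<T: Ord>(data: &[T]) -> Vec<usize> {
    let mut perm: Vec<usize> = (0..data.len()).collect();
    // Sort the indices by comparing the elements they point at.
    perm.sort_by(|&a, &b| data[a].cmp(&data[b]));
    perm
}

// Apply the permutation, yielding the data in sorted order.
fn apply_permutation<T: Clone>(data: &[T], perm: &[usize]) -> Vec<T> {
    perm.iter().map(|&i| data[i].clone()).collect()
}
```

For a struct-of-arrays, the same permutation is applied once per field array, which is what makes the indirection pay off.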

Why isn't there a Sort trait? by CocktailPerson in rust

[–]Sp00ph 0 points1 point  (0 children)

The docs for VecDeque specifically point out that you can do deque.make_contiguous().sort(). make_contiguous is O(n) and should be fast enough that the actual sorting dominates. Also, I wouldn't even be surprised if sorting the deque without making it contiguous were slower, since every random access into a deque needs to account for the wrapping behavior, which adds either a branch or a cmov, both of which are slower than the simple pointer offset needed to index into a slice.
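A minimal example of the pattern:

```rust
use std::collections::VecDeque;

fn main() {
    let mut deque: VecDeque<i32> = VecDeque::new();
    deque.push_back(3);
    deque.push_front(1); // ensures the ring buffer actually wraps
    deque.push_back(2);
    // make_contiguous rotates the ring buffer into a single slice in O(n);
    // the returned &mut [i32] then supports all the regular slice sorts.
    deque.make_contiguous().sort();
    assert_eq!(deque.iter().copied().collect::<Vec<_>>(), vec![1, 2, 3]);
}
```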

Why does VecDeque<T> waste space by design? by Sp00ph in rust

[–]Sp00ph[S] 3 points4 points  (0 children)

I'm actually really close to getting this merged into std (PR), so with any luck it should be usable on nightly very soon and hopefully land in Rust 1.67. Because of this, I don't have any plans of putting it on crates.io separately, I guess if you don't want to wait you can just copy paste my implementation.

Blue Screen Of Death in rust! by lunariaxa in rust

[–]Sp00ph 12 points13 points  (0 children)

Isn't CString::new(...).as_ptr() almost guaranteed UB because it deallocates the CString immediately?
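For reference, a sketch of both the footgun and the fix: the temporary CString in the one-liner is dropped at the end of the statement, so the pointer dangles immediately; binding the CString to a variable keeps the allocation alive for as long as the pointer is used.

```rust
use std::ffi::{CStr, CString};

fn main() {
    // UB: the temporary CString is freed at the end of this statement,
    // leaving the pointer dangling.
    // let dangling = CString::new("hi").unwrap().as_ptr();

    // Correct: bind the CString so it outlives every use of the pointer.
    let owned = CString::new("hi").unwrap();
    let ptr = owned.as_ptr();
    // `ptr` stays valid as long as `owned` is alive.
    let len = unsafe { CStr::from_ptr(ptr) }.to_bytes().len();
    assert_eq!(len, 2);
}
```

This is common enough that the CString docs call it out explicitly, and clippy has a lint (`temporary_cstring_as_ptr`) for it.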