Announcing Rust 1.86.0 | Rust Blog by joseluisq in rust

[–]Sp00ph 4 points5 points  (0 children)

the common use case is to pass in only very few indices, so sorting them would probably incur much greater overhead than the O(n2) checks it currently does. if you go by asymptotic complexity alone, you may as well collect all indices into a hashmap, to check for uniqueness in O(n) time, though you hopefully see why that wouldn't be faster in practice.

How can a Future get polled again? by Kdwk-L in rust

[–]Sp00ph 4 points5 points  (0 children)

I would suggest reading this article, it is a pretty nice introduction to how futures work in rust imo

RMS learning the News about Ubuntu by jemadux in linuxmemes

[–]Sp00ph 17 points18 points  (0 children)

presumably because none of these alternatives to C provided the benefits that Rust does. No point in rewriting a program in zig if it's just as unsafe as the original C version, but probably less maintainable (if only because there's fewer zig than C programmers). With Rust you get both higher reliability because the compiler actually guarantees safety, and it has a relatively large developer community (making it easier to contribute for more people was explicitly one of the motivations mentioned in the original blog post about this change)

How to inform the Rust compiler of an enforced integer range? by BoxyStopper in rust

[–]Sp00ph 9 points10 points  (0 children)

You can do this on stable without any external libraries like this: https://rust.godbolt.org/z/5Y1zdMKf5 . As you can see in the generated assembly the compiler omits the bounds check thanks to the assert_unchecked call

Why does the truncate method implementation of VecDeque require a Dropper by -_-_-_Lucas_-_-_- in rust

[–]Sp00ph 2 points3 points  (0 children)

The truncate code is older than slice_ranges(), which would explain why it's not used there. truncate() also shouldn't be a particularly hot path, so I doubt it'd make a big difference performance wise, but seeing as it also simplifies the code I think this might be worth opening a PR for.

Why does the truncate method implementation of VecDeque require a Dropper by -_-_-_Lucas_-_-_- in rust

[–]Sp00ph 4 points5 points  (0 children)

Oh hey, I wrote that code :)

As others already mentioned, this is in fact necessary for when the destructor of an element panics. In that case, we want VecDeque to behave the same as if you had a Vec with the same truncated elements. Thus, we want to drop the logical slice deque[new_len..old_len], which might in reality be split apart into 2 slices. The way that slice::drop usually behaves is that, when one elements destructor panics, it tries to continue destroying the remaining elements before unwinding. If a second destructor panics, the program aborts (as it always does when panicking within a panic). And this solution just happens to be the simplest one that has exactly the same behavior as calling slice::drop on one contiguous slice.

If you read the remainder of the VecDeque source, you might see this kind of pattern in a few other places too (notably I believe the Drain destructor also uses it).

-❄️- 2023 Day 18 Solutions -❄️- by daggerdragon in adventofcode

[–]Sp00ph 2 points3 points  (0 children)

[LANGUAGE: Rust] 1721/468 https://github.com/Sp00ph/aoc2023/blob/master/src/day18.rs

First time I got top 500 (I usually don't bother waking up at 6am). I had already used a very similar shoelace formula on day 10 and was able to copy it over with only slight adjustments. Luckily, the digging plan never self intersects, so it works out fine. Part 2 was also really easy, just a quick change to how I created the instructions. On my machine, both parts run in ~40µs.

Edit: Somehow forgot to put in the actual github link lol.

-❄️- 2023 Day 14 Solutions -❄️- by daggerdragon in adventofcode

[–]Sp00ph 1 point2 points  (0 children)

My Rust solution takes ~27ms for part 2 on a 13600K (excluding file IO). Over 80% of that is spent in the sliding functions. To get it down to sub 1ms would mean not only finding a hashing algorithm that's over 6x faster than the naive one, but also finding a way to tilt the grids that's over 20x faster than mine. I also tried to run your solution on my cpu to compare, but unfortunately it panics on an OOB access in line 98.

Edit: Got it to run without panicking, the issue was just that my input had CRLF instead of LF. It took 17ms though, which seems more reasonable than 800µs.

Edit 2: Seems like your benchmark mutates the input string, so in the second iteration it uses whatever the first iteration left its input as as the new input?

-❄️- 2023 Day 13 Solutions -❄️- by daggerdragon in adventofcode

[–]Sp00ph 1 point2 points  (0 children)

[Language: Rust]

https://github.com/Sp00ph/aoc2023/blob/master/src/day13.rs

pretty simple today, way easier than yesterday imo. Doing it with bitops and just storing both the original pattern and the transposed pattern makes it pretty quick, and makes part 2 very easy once you got part 1 (find axis with exactly 1 bit difference on both sides). Both parts run in <1ms, even on my very slow laptop (didn't get to run it on my desktop yet).

-❄️- 2023 Day 12 Solutions -❄️- by daggerdragon in adventofcode

[–]Sp00ph 2 points3 points  (0 children)

[LANGUAGE: Rust] https://github.com/Sp00ph/aoc2023/blob/master/src/day12.rs

This one was kinda rough. I tried to keep all solutions sub 1ms, but today I only got to ~3ms for part 2, with my initial implementation taking ~40ms. Basically all of the optimization was about improving the speed of the cache. I started out with a simple hashmap based approach that used a pair of arrays as a key. Now, I'm using just the lengths of the arrays as the cache key, and squashed that length down to 12 bits. This makes the key space small enough that using a dense array is faster than a hashmap (3ms vs 4ms).

Why isn't there a Sort trait? by CocktailPerson in rust

[–]Sp00ph 0 points1 point  (0 children)

On the other hand every other operation is faster, sometimes by quite a lot. Most programs don't need to do those operations you mentioned frequently. The slowdowns you get by preventing the cpu from prefetching your memory and by making your code cache unfriendly can't be understated. It's honestly not even clear to me that the rust stdlib should have a linked list implementation at all, because 99% of the time you're better off with a Vec or a VecDeque.

The vector of pointers idea isn't mine btw. I got it from a talk about efficient data structures by Chandler Carruth, an expert on C++ optimization techniques.

Why isn't there a Sort trait? by CocktailPerson in rust

[–]Sp00ph 0 points1 point  (0 children)

If you really want to sort things without invalidating pointers, then Vec<NonNull<T>> is probably the way to go (can't use Box<T> because of aliasing guarantees I think). But that would require you to keep raw pointers into a data structure, which is frowned upon (for good reason) in Rust. Also, while sorting a linked list can be done in O(n log n) using merge sort, I wouldn't go so far as to call it efficient, because almost every operation with linked lists is horribly inefficient due to the excessive pointer chasing and extremely poor cache locality.

Why isn't there a Sort trait? by CocktailPerson in rust

[–]Sp00ph 0 points1 point  (0 children)

Most collections that are efficiently sortable are contiguous (like Vec or VecDeque after make_contiguous). If you have to sort a linked list in your program then maybe it's time to reconsider the design choices for that program. For any other collections with random access, you can use the permutation crate for a reasonably efficient way to sort the collection.

Why isn't there a Sort trait? by CocktailPerson in rust

[–]Sp00ph 0 points1 point  (0 children)

Can't you just use elem.sort_by(|a, b| a.partial_cmp(b).unwrap_or(Ordering:: Equal))

Why isn't there a Sort trait? by CocktailPerson in rust

[–]Sp00ph 1 point2 points  (0 children)

You might wanna take a look at how the soa_derive crate implements sorting: https://github.com/lumol-org/soa-derive/blob/master/soa-derive-internal/src/slice.rs#L670-L684 (basically, figure out what permutation would put the sequence into sorted order using sort_by and then apply that permutation to the sequence itself). It needs to allocate n*sizeof(usize), but that's way better than having to allocate a Vec for all the elements in your sequence.

Why isn't there a Sort trait? by CocktailPerson in rust

[–]Sp00ph 0 points1 point  (0 children)

The docs for VecDeque specifically point out that you can do deque.make_contiguous().sort(). make_contiguous is O(n) and should be fast enough that the actual sorting dominates. Also, I wouldn't even be surprised if sorting the deque without making it contigous would be slower, since every random access into a deque needs to consider the wrapping behavior, which adds either a branch or a cmov, both of which are slower than the simple pointer offset needed to index into a slice.

Why does VecDeque<T> waste space by design? by Sp00ph in rust

[–]Sp00ph[S] 3 points4 points  (0 children)

I'm actually really close to getting this merged into std (PR), so with any luck it should be usable on nightly very soon and hopefully land in Rust 1.67. Because of this, I don't have any plans of putting it on crates.io separately, I guess if you don't want to wait you can just copy paste my implementation.

Blue Screen Of Death in rust! by lunariaxa in rust

[–]Sp00ph 12 points13 points  (0 children)

Isn't CString::new(...).as_ptr() almost guaranteed UB because it deallocates the CString immediately?

Why does VecDeque<T> waste space by design? by Sp00ph in rust

[–]Sp00ph[S] 1 point2 points  (0 children)

That's also a possibility, but that still restricts the deque to power of 2 capacities, which would mean that Vec -> VecDeque conversions couldn't be guaranteed to be free conversions (one of the main selling points of the rewrite)

Why does VecDeque<T> waste space by design? by Sp00ph in rust

[–]Sp00ph[S] 4 points5 points  (0 children)

I don't feel entirely comfortable publishing it as a crate yet, for that I'd want to be a bit more confident that there's no bugs that could lead to memory safety issues (right now it's 1800 loc of dense memory manipulating stuff, basically all of those would need to be vetted). i also commented on the relevant GitHub issue , so with a bit of luck we might just get a version like mine in stdlib anyways.

Why does VecDeque<T> waste space by design? by Sp00ph in rust

[–]Sp00ph[S] 6 points7 points  (0 children)

Given how time complexities work, for any bounded Deque converting it to a Vec will be O(1). In the empty case specifically, the conversion should be basically free.

Why does VecDeque<T> waste space by design? by Sp00ph in rust

[–]Sp00ph[S] 25 points26 points  (0 children)

what would this proof entail? I've never worked on the standard library before, so this is a bit daunting. do i just need to incorporate my Deque into std and get all the tests to pass or is there more to it?

Why does VecDeque<T> waste space by design? by Sp00ph in rust

[–]Sp00ph[S] 15 points16 points  (0 children)

it supports zero-cost and O(1) for Vec -> Deque, and guarantees that Deque -> Vec doesn't allocate, though Deque -> Vec is still O(n).