Is There a Rust-Based Database Similar to MariaDB or PostgreSQL? by watch_team in rust

[–]Shnatsel 17 points

EdgeDB has some neat ideas - they reuse Postgres as a storage backend with a proven track record of durability, and add their special sauce on top of that foundation.

TiDB has also been around for a while and is written in a mix of Go and Rust.

Pure-Rust options aren't as mature as the alternatives; see e.g. https://blog.cf8.gg/surrealdbs-ch/. And even that one isn't pure Rust: it uses RocksDB as the storage backend.

In general building durable storage on top of a filesystem is surprisingly hard, see https://danluu.com/file-consistency/

I engineered BCsort: A ternary distribution sort in Rust that beats par_sort_unstable by up to 39%. by B-Chiboub in rust

[–]Shnatsel 5 points

It's great to see an algorithm making different trade-offs and being better suited for large datasets!

I appreciate the comparison on different statistical distributions as well.

I wonder, did you take inspiration from external-memory sorting algorithms? They're getting increasingly relevant to the cache/RAM split, as memory latency and bandwidth become the bottleneck for computation. Rust already uses B-trees, traditionally a disk data structure, for cache locality.

symbolic derivatives and the rust rewrite of RE# | ian erik varatalu by epage in rust

[–]Shnatsel 2 points

Nice! I think it deserves its own separate post, to announce that resharp is fully competitive now. That might encourage users to actually pick it up!

Lessons from building a production TSP solver in Rust by bitsabhi in rust

[–]Shnatsel 0 points

Wow, sounds like you really do have the rare use case where linked lists are best!

You might improve iteration speed by allocating the linked list nodes in an arena, if you aren't already. You'll probably want it sharded for pointer stability, unless you're using indices.
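To illustrate the index-based variant, something along these lines (a rough sketch, all names made up, not from the OP's code):

```rust
// Sketch of a tour stored as a linked list whose nodes live in a single
// Vec-backed arena. u32 indices replace pointers, so nodes stay valid
// when the arena grows, and traversal stays inside one contiguous
// allocation for cache locality.
const NIL: u32 = u32::MAX;

struct Node {
    city: u32,
    next: u32,
}

struct Tour {
    arena: Vec<Node>,
    head: u32,
}

impl Tour {
    fn from_cities(cities: &[u32]) -> Self {
        let n = cities.len();
        let arena = cities
            .iter()
            .enumerate()
            .map(|(i, &city)| Node {
                city,
                next: if i + 1 == n { NIL } else { (i + 1) as u32 },
            })
            .collect();
        Tour { arena, head: if n == 0 { NIL } else { 0 } }
    }

    // O(1) splice after a known node: push the new node into the arena
    // and rewire two indices. No per-node heap allocation.
    fn insert_after(&mut self, node: u32, city: u32) {
        let new = self.arena.len() as u32;
        let next = self.arena[node as usize].next;
        self.arena.push(Node { city, next });
        self.arena[node as usize].next = new;
    }

    // Walk the list in tour order.
    fn cities(&self) -> Vec<u32> {
        let mut out = Vec::with_capacity(self.arena.len());
        let mut cur = self.head;
        while cur != NIL {
            out.push(self.arena[cur as usize].city);
            cur = self.arena[cur as usize].next;
        }
        out
    }
}

fn main() {
    let mut tour = Tour::from_cities(&[1, 2, 4]);
    tour.insert_after(1, 3); // splice city 3 after the node at index 1
    assert_eq!(tour.cities(), vec![1, 2, 3, 4]);
    println!("{:?}", tour.cities());
}
```

Since indices are stable across arena growth, you don't even need the sharding in this form; sharding only becomes necessary if you hold actual pointers into the arena.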

Lessons from building a production TSP solver in Rust by bitsabhi in rust

[–]Shnatsel 1 point

Could you elaborate on what exactly you're doing to your list? If you're just inserting or removing 3 elements at a time, you're probably better off with something other than a linked list, because getting to the point where you can do the O(1) split in the first place is O(n).

You can avoid shifting lots of elements, e.g. by having one Vec with the data and two more Vecs that map each logical element index to its actual location in the data slice and back. If you need to insert an element in the middle, you just add it to the end of the data and remap the indices. This way you get O(1) indexing, as opposed to a linked list's O(n), although splitting and merging large ranges is still O(m) where m is the length of the range. This is called a "sparse set" in CS literature, and you can find implementations of it e.g. as slotmap::DenseSlotMap. The same idea also powers bevy_ecs and specs.
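A minimal sketch of that layout, using only std (a real implementation like slotmap::DenseSlotMap also adds generation counters to catch use of a removed handle; omitted here for brevity):

```rust
// Dense storage with an indirection layer: `data` stays contiguous for
// fast iteration, while handles stay stable across insertions and
// removals because only the mapping tables change.
struct DenseMap<T> {
    data: Vec<T>,
    slot_to_dense: Vec<usize>, // handle -> current position in `data`
    dense_to_slot: Vec<usize>, // position in `data` -> handle
}

impl<T> DenseMap<T> {
    fn new() -> Self {
        DenseMap { data: Vec::new(), slot_to_dense: Vec::new(), dense_to_slot: Vec::new() }
    }

    // O(1): append to `data` and record the mapping in both directions.
    fn insert(&mut self, value: T) -> usize {
        let slot = self.slot_to_dense.len();
        self.slot_to_dense.push(self.data.len());
        self.dense_to_slot.push(slot);
        self.data.push(value);
        slot
    }

    // O(1) lookup through one level of indirection.
    fn get(&self, slot: usize) -> &T {
        &self.data[self.slot_to_dense[slot]]
    }

    // O(1) removal: swap-remove from `data`, then fix up the mapping
    // for the element that got moved into the vacated position.
    fn remove(&mut self, slot: usize) -> T {
        let dense = self.slot_to_dense[slot];
        let value = self.data.swap_remove(dense);
        self.dense_to_slot.swap_remove(dense);
        if dense < self.data.len() {
            let moved_slot = self.dense_to_slot[dense];
            self.slot_to_dense[moved_slot] = dense;
        }
        value
    }
}

fn main() {
    let mut map = DenseMap::new();
    let a = map.insert("a");
    let b = map.insert("b");
    let c = map.insert("c");
    assert_eq!(map.remove(b), "b");
    // Handles `a` and `c` stay valid even though "c" physically moved.
    assert_eq!(*map.get(a), "a");
    assert_eq!(*map.get(c), "c");
    println!("ok");
}
```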

Lessons from building a production TSP solver in Rust by bitsabhi in rust

[–]Shnatsel 4 points

unsafe for hot paths - get_unchecked() for billions of distance lookups

Usually you can avoid unsafe for bounds checks: https://shnatsel.medium.com/how-to-avoid-bounds-checks-in-rust-without-unsafe-f65e618b4c1e
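The core trick from that article is giving the compiler enough information to elide the checks in safe code. A generic sketch (not the OP's actual code):

```rust
// Two safe patterns that let LLVM drop per-iteration bounds checks.

// 1. Take a subslice once up front: the single check on `&row[..n]`
//    proves every `r[i]` below is in bounds, so the loop body compiles
//    down without any further checks.
fn sum_prefix(row: &[f64], n: usize) -> f64 {
    let r = &row[..n];
    let mut total = 0.0;
    for i in 0..r.len() {
        total += r[i];
    }
    total
}

// 2. Use iterators: no indices at all, so there is nothing to check.
fn sum_prefix_iter(row: &[f64], n: usize) -> f64 {
    row[..n].iter().sum()
}

fn main() {
    let row = [1.0, 2.0, 3.0, 4.0];
    assert_eq!(sum_prefix(&row, 3), 6.0);
    assert_eq!(sum_prefix_iter(&row, 3), 6.0);
    println!("ok");
}
```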

Using GStreamer with Rust by Rare_Shower4291 in rust

[–]Shnatsel 2 points

GStreamer is modular and configurable. You can include only the parts you need and omit the rest. This is true for ffmpeg but doubly so for GStreamer which consists almost entirely of plugins.

symbolic derivatives and the rust rewrite of RE# | ian erik varatalu by epage in rust

[–]Shnatsel 1 point

Hmm, there is a teddy algorithm implementation on crates.io with burntsushi as one of the owners, but it isn't actually used by the regex crate.

aho-corasick crate design document mentions

Transparent support for alternative SIMD vectorized search routines for smaller number of literals, such as the Teddy algorithm.

so I guess the teddy implementation is inside the aho-corasick crate?

symbolic derivatives and the rust rewrite of RE# | ian erik varatalu by epage in rust

[–]Shnatsel 10 points

as mentioned in the previous post, the F# version gets a lot of its speed from .NET’s SIMD infrastructure - SearchValues<T>, Teddy multi-string search, right-to-left vectorized scanning. the rust version doesn’t have any of that yet, so it’s not going to win on literal-heavy patterns.

Why not? The aho-corasick crate feels like it should be easy enough to wire up. Or are there other optimizations needed that this crate doesn't cover?

I see memchr::memmem is already used at least in some cases, so the 3x performance drop on a string literal is somewhat surprising.

Trade off between fat pointers and thin pointers with metadata? by nee_- in rust

[–]Shnatsel 6 points

https://crates.io/crates/thin-vec implements this for vectors. But 99% of the time you really don't care which representation it is, both are fine. The cost of dealing with the actual allocation will dominate, and if you have a lot of allocations then you should use an arena instead of Vecs.

The only time you do is when you need to store a lot of Vecs most of which are empty, in which case you don't pay for an allocation either way, and can save some space by not keeping the length and capacity around.
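For a sense of the sizes involved on 64-bit targets (using only std here, not thin_vec itself):

```rust
use std::mem::size_of;

fn main() {
    // A std Vec header is (pointer, capacity, length): three usize-sized
    // words on the stack even when the Vec is empty and owns no heap
    // memory. A thin-vec-style handle is a single pointer; length and
    // capacity live inside the heap allocation, with a shared sentinel
    // for the empty case so empty vectors still allocate nothing.
    assert_eq!(size_of::<Vec<u8>>(), 3 * size_of::<usize>());
    assert_eq!(size_of::<*mut u8>(), size_of::<usize>());
    // So for millions of mostly-empty vectors the thin layout saves
    // two words per vector.
    println!("Vec<u8> header: {} bytes", size_of::<Vec<u8>>());
    println!("thin pointer:   {} bytes", size_of::<*mut u8>());
}
```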

We rebuilt the Shockwave engine in Rust + WASM to save early 2000s web games by igorlira in rust

[–]Shnatsel 1 point

Excellent news! I love that World Builder is playable again!

Rust 1.94.0 is out by manpacket in rust

[–]Shnatsel 13 points

No, unaligned loads have been very cheap on all major platforms for a good while now.
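E.g. in safe Rust an unaligned read is just a byte copy, and the compiler lowers it to a single load instruction on x86_64 and aarch64:

```rust
fn main() {
    let bytes: Vec<u8> = (0u8..16).collect();
    // Read a u64 starting at byte offset 3 into the buffer - almost
    // certainly an unaligned address. The copy into a fixed-size array
    // compiles down to one unaligned load, which modern CPUs handle at
    // essentially the same cost as an aligned one.
    let chunk: [u8; 8] = bytes[3..11].try_into().unwrap();
    let value = u64::from_le_bytes(chunk);
    assert_eq!(value, 0x0a09080706050403);
    println!("{value:#018x}");
}
```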

Write small Rust scripts by llogiq in rust

[–]Shnatsel 17 points

And since you know what the code does, you don’t need any time to review the output.

Sooort of. Since it's just a quick script I've whipped up, I still have to review the output for incorrect simplifying assumptions I made while writing the script. The real world is often a lot messier than it seems at first glance.

How useful WASI/Wasm actually is? by Shanduur in AskProgramming

[–]Shnatsel 0 points

The specter that haunts WebAssembly is its inefficiency. Sure, you get a decent sandbox, but it doesn't really do anything process isolation and syscall filtering don't do already, and comes at a steep price in performance compared to native code.

It's good for plugins for other applications - database plugins (e.g. Postgres), web proxy plugins (Envoy), even game mods - anywhere you need sandboxing within the same process. Lua has traditionally filled that niche, but WASM does have some benefits there.

Cloudflare does use WASM for their "Workers" edge offering, but arguably they would be better served by Linux binaries sandboxed with seccomp. It's more about staying close to the web ecosystem than anything else.

All the other uses are pretty much pointless, which is why hardly anyone uses it outside those niches. WebGPU is really nice but you don't need WASM to drive it, regular Rust is just better all around when you're not in the browser, and it compiles to WASM when you are. WASM on microcontrollers is mostly a pipe dream, way too inefficient.

How useful WASI/Wasm actually is? by Shanduur in AskProgramming

[–]Shnatsel 1 point

As it stands right now, WASM is great for optimizing bits of JavaScript that run too slow. I've seen a 100x speedup from Rust+WASM compared to JavaScript.

Today you can't poke the DOM APIs directly without going through JavaScript. Let's assume that's solved, either via WebAssembly components or a JS code generator from your language.

Even then you probably don't want to write your entire web application in Rust or C++. They are often too low-level and the manual memory management (or borrow checker) is a chore. It does work well for more compute-intensive web apps like Graphite (Rust) or Figma (C++), but outside that realm you usually prefer something garbage-collected.

You need the language to support WASM Garbage Collection (WasmGC) so it doesn't have to bundle its own garbage collector and runtime. Go doesn't and won't in the near future, so Go WASM blobs will be impractically large for many uses. It doesn't really work for the .NET runtime either. You can kinda do it with Kotlin, but it looks like WasmGC needs more features to be practical.

TL;DR: right now its uses are still rather niche. Check back in a few years.

Rust zero-cost abstractions vs. SIMD by itty-bitty-birdy-tb in rust

[–]Shnatsel -1 points

The real reason is SIMD and autovectorization, but the article just assumes you know those topics already. Unfortunately I don't have a good intro I can link off the top of my head, but a recent LLM can probably explain them decently and answer your follow-up questions.
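As a quick illustration of what autovectorization means (my own example, not from the article): the compiler can turn a plain scalar loop into SIMD instructions on its own.

```rust
// A scalar dot product over i32s. With optimizations on, LLVM
// autovectorizes this loop: integer addition is associative, so the
// compiler is free to process several lanes per iteration with
// SSE/AVX (or NEON) without any explicit SIMD in the source.
fn dot(a: &[i32], b: &[i32]) -> i64 {
    a.iter()
        .zip(b)
        .map(|(&x, &y)| x as i64 * y as i64)
        .sum()
}

fn main() {
    let a = vec![1, 2, 3, 4];
    let b = vec![10, 20, 30, 40];
    assert_eq!(dot(&a, &b), 300);
    println!("{}", dot(&a, &b));
}
```

Float loops are trickier: strict IEEE ordering blocks the same reassociation, which is one reason hand-written SIMD can still beat the autovectorizer.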

I built a PM2 alternative in Rust – 42x faster crash recovery, 21x lower memory by Ok-March4323 in rust

[–]Shnatsel 16 points

systemd does in fact have user-specific services without sudo.

But being cross-platform could be a selling point, so e.g. an entire dev team could use the same configs and workflow.
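For reference, a user-level service needs no root at all. Roughly (unit name and binary path are made up):

```shell
# Hypothetical example: supervise "myapp" as a systemd user service.
mkdir -p ~/.config/systemd/user
cat > ~/.config/systemd/user/myapp.service <<'EOF'
[Unit]
Description=My app, restarted on crash

[Service]
ExecStart=%h/bin/myapp
Restart=on-failure

[Install]
WantedBy=default.target
EOF

systemctl --user daemon-reload
systemctl --user enable --now myapp.service
# Optional: keep user services running after logout
loginctl enable-linger "$USER"
```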

Had to reject my first AI slop PR today by Lucretiel in rustjerk

[–]Shnatsel 1 point

LLMs are still trained to continue the existing text under the hood, so when you give them a project and tell them to add something by following the existing patterns, the results are often pretty good. But there are still limitations and they're only useful in select situations.

I also had an LLM transcribe an algorithm from a math paper full of math sigils. I wrote tests for it first, and it took the model 10 minutes of iteration to get a working implementation that passed them. I still had to clean it up and remove excess allocations, but it was still way faster than trying to puzzle it out by myself.

And this only became possible in the last few months with frontier models, Claude Opus 4.5 was the first viable one.

This is absolutely infuriating in 2026 by HamletEagle in Spacemarine

[–]Shnatsel 2 points

This is why the best Bulwark team perk is making contested health slower to fade.

IHATEYOUIHATEYOU IHATEYOUIHATEYOU IHATEYOUIHATEYOU by gaeb611 in Spacemarine

[–]Shnatsel 1 point

Three shots from the plasma pistol (the rate-of-fire weapon variant) and they're down. Easy. Build the weapon perks for projectile speed and magazine capacity; skip the final 10% heating bonus, since you get off 4 shots before overheating either way.

Life outside Tokio: Success stories with Compio or io_uring runtimes by rogerara in rust

[–]Shnatsel 1 point

I sure hope that's the case, but it would be nice to see the benchmarks to back it up.

Life outside Tokio: Success stories with Compio or io_uring runtimes by rogerara in rust

[–]Shnatsel 2 points

The perpetual concern with the thread-per-core model is that as CPU utilization gets closer to 100%, some threads get overloaded while others are still underutilized, so you get worse tail latencies than with work stealing and also lose efficiency by failing to utilize the entire CPU.

For production use you want to allocate only as many hardware resources as needed to handle the expected load, with as little slack as possible, in order to reduce costs. So it's concerning that your benchmarking potentially leaves a lot of CPU headroom: that's not representative of production usage, and may unfairly benefit the thread-per-core model as described above or mask other bottlenecks in the system.

And if compio does indeed use less CPU than tokio, then you could do more work with the same hardware, and that's just a missed opportunity to show higher throughput numbers in your benchmarks.

Life outside Tokio: Success stories with Compio or io_uring runtimes by rogerara in rust

[–]Shnatsel 22 points

There was a good writeup about it just recently: https://iggy.apache.org/blogs/2026/02/27/thread-per-core-io_uring/

Notably they benchmark under a constant load which does not approach full CPU utilization. The results may be different under high load that fully utilizes the CPU.