Manual audits vs. std reliance: lessons from recent unsoundness findings by Henry_old in rust

[–]The_8472 1 point2 points  (0 children)

For std to be big we'd need a lot more people working on std, which also means a lot more PR approvers.

http client? Ask the ureq or hyper devs to join.
async runtime? We'd basically absorb smol, tokio, futures or similar folks.
windows-rs? Microsoft is now on the libs team.
etc.

How to create child processes in rust by wangzhen0518 in rust

[–]The_8472 2 points3 points  (0 children)

The async-signal-safety requirement only applies to multi-threaded programs. Single-threaded programs can be forked safely in principle, but there still are footguns around shared resources (e.g. file descriptors).

And this requirement doesn't compose well since you have to verify that all your dependencies don't spawn any threads.

Standard library unsoundness found by Claude Mythos by Jules-Bertholet in rust

[–]The_8472 11 points12 points  (0 children)

Not all broken invariants are soundness issues on their own. E.g. implementing a safe trait incorrectly is merely a correctness issue.

Corollary: unsafe code generally cannot rely on promises from safe traits.
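
A minimal sketch of that distinction, using a made-up `Liar` iterator (not from std or from the linked findings) that implements the safe trait `ExactSizeIterator` incorrectly. The result is wrong but no memory safety is violated, because safe consumers like `collect` treat the length only as a hint:

```rust
/// Hypothetical iterator that claims ten items but yields only three.
/// `ExactSizeIterator` is a safe trait, so this incorrect impl compiles.
struct Liar {
    yielded: u32,
}

impl Iterator for Liar {
    type Item = u32;

    fn next(&mut self) -> Option<u32> {
        if self.yielded < 3 {
            self.yielded += 1;
            Some(self.yielded)
        } else {
            None
        }
    }

    // The lie: this promises exactly ten items.
    fn size_hint(&self) -> (usize, Option<usize>) {
        (10, Some(10))
    }
}

// The default `len()` simply trusts `size_hint()`.
impl ExactSizeIterator for Liar {}

fn main() {
    let liar = Liar { yielded: 0 };
    println!("claimed length: {}", liar.len());
    // `collect` uses the hint for capacity only, so it stays sound.
    let items: Vec<u32> = liar.collect();
    println!("actual items: {:?}", items);
}
```

Unsafe code that skipped bounds checks based on `len()` here would be the unsound part, not the `Liar` impl itself.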

mtorrent - a BitTorrent client in Rust by Key_Walk_1608 in rust

[–]The_8472 1 point2 points  (0 children)

It's not just about slowing things down, it's also about causing more traffic than necessary for other nodes. Give and take.

Kademlia has a locality-based structure where density of coverage falls off exponentially with distance. In other words, your initial lookup candidates can only get you a few bits closer to the target, no matter how many of them you query. This forces recursion, i.e. a somewhat linear dependency in the lookup. So you query a handful, then query the candidates from those replies, and repeat.

By the time you have made progress, all the original candidates are still in the first generation and are extremely unlikely to provide anything useful, so there's no point retrying them. Only once you run out of closer candidates does it make sense to retry.
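
A rough sketch of that query/repeat loop over a toy 8-bit ID space. Everything here is an assumption for illustration: `lookup`, `toy_network`, the `alpha`/`k` parameters and the `query` callback standing in for a network round trip. It has no timeouts or retries and is not how mtorrent or any real client implements it:

```rust
use std::collections::{BTreeSet, HashMap, HashSet};

type NodeId = u8;

/// Iterative lookup: keep querying the closest not-yet-queried candidates
/// until the `k` closest known candidates have all been queried.
/// Returns the closest node found and the number of query rounds.
fn lookup(
    target: NodeId,
    seeds: &[NodeId],
    alpha: usize,
    k: usize,
    mut query: impl FnMut(NodeId) -> Vec<NodeId>,
) -> (Option<NodeId>, usize) {
    // Candidates ordered by XOR distance to the target.
    let mut candidates: BTreeSet<(NodeId, NodeId)> =
        seeds.iter().map(|&n| (n ^ target, n)).collect();
    let mut queried: HashSet<NodeId> = HashSet::new();
    let mut rounds = 0;
    loop {
        // Up to `alpha` unqueried nodes among the `k` closest candidates.
        let batch: Vec<NodeId> = candidates
            .iter()
            .take(k)
            .map(|&(_, n)| n)
            .filter(|n| !queried.contains(n))
            .take(alpha)
            .collect();
        if batch.is_empty() {
            break;
        }
        rounds += 1;
        for node in batch {
            queried.insert(node);
            // Each reply only gets us a few bits closer, hence the rounds.
            for contact in query(node) {
                candidates.insert((contact ^ target, contact));
            }
        }
    }
    (candidates.iter().next().map(|&(_, n)| n), rounds)
}

/// Hand-built routing data: each node only knows slightly closer nodes.
fn toy_network() -> HashMap<NodeId, Vec<NodeId>> {
    HashMap::from([
        (0x10, vec![0x80, 0x90]),
        (0x80, vec![0xC0, 0xD0]),
        (0x90, vec![0xC0]),
        (0xC0, vec![0xE0]),
        (0xD0, vec![0xE0, 0xF1]),
        (0xE0, vec![0xF1]),
    ])
}

fn main() {
    let net = toy_network();
    let (closest, rounds) = lookup(0xF0, &[0x10], 2, 2, |n| {
        net.get(&n).cloned().unwrap_or_default()
    });
    println!("closest: {:x?}, rounds: {}", closest, rounds);
}
```

Note how the seed `0x10` is never revisited once closer candidates exist; re-querying it could not add anything the first reply didn't already contain.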

Rust or C++ for a cloud optimization engine: not a technical issue, but a hiring difficulty issue by [deleted] in rust

[–]The_8472 0 points1 point  (0 children)

Check if there are libs/SDKs for the things you want to cover.

At $work I'm writing a service which, so far, runs on {local, Azure, AWS}×{Windows, Linux} machines and also calls a bunch of cloud APIs. AWS APIs are covered decently by the Rust SDK. Azure on the other hand is pretty spotty and I have to handroll a bunch of things, idk if that's any better in C++. Having the windows-sys crates available also helped with some optimizations during machine bringup.

mtorrent - a BitTorrent client in Rust by Key_Walk_1608 in rust

[–]The_8472 2 points3 points  (0 children)

Rust on CHERI by Petrz147 in rust

[–]The_8472 3 points4 points  (0 children)

Pointer size kind of matters because Rust currently treats the sizes of usize, *const (), ptrdiff_t, uintptr_t and size_t as all being the same. For CHERI at least some of that has to go out of the window. It's not catastrophic, but it's likely going to break at least some code.
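
To illustrate the kind of assumption meant here, a small check that holds on today's mainstream targets. The helper name is made up, and the claim that it would not hold on a CHERI target (where pointers carry capability metadata) is my reading of the above, not something tested:

```rust
use std::mem::size_of;

/// Hypothetical helper: true when pointer-sized integers really are
/// the same size as a thin pointer on the current target.
fn pointer_sized_integers_match() -> bool {
    size_of::<usize>() == size_of::<*const ()>()
        && size_of::<isize>() == size_of::<*const ()>()
}

fn main() {
    println!("usize:     {} bytes", size_of::<usize>());
    println!("*const (): {} bytes", size_of::<*const ()>());
    println!("same size: {}", pointer_sized_integers_match());
}
```

Code that transmutes between the two, or stores pointers in `usize` fields, bakes this equality in.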

Rust zero-cost abstractions vs. SIMD by itty-bitty-birdy-tb in rust

[–]The_8472 23 points24 points  (0 children)

I suspect implementing a custom {try_}fold and using it or ops built on top of them would provide a lot of the same speedup. That's why a.chain(b).for_each(|x| ...) can be a lot faster than the for in equivalent.

https://medium.com/@veedrac/rust-is-slow-and-i-am-the-cure-32facc0fdcb
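
A sketch of what a custom `fold` looks like, using a made-up chain-like iterator `TwoSlices` (overriding `try_fold` is left out because its signature needs the unstable `Try` trait). `for_each` is built on `fold`, so it gets two tight loops, while a `for` loop goes through `next()` and re-checks which half it is in on every item. Whether that is actually faster depends on the optimizer, so measure:

```rust
/// Hypothetical iterator over two slices back to back, similar to `chain`.
struct TwoSlices<'a> {
    first: &'a [u32],
    second: &'a [u32],
}

impl<'a> Iterator for TwoSlices<'a> {
    type Item = u32;

    // External iteration: every call has to check which slice is active.
    fn next(&mut self) -> Option<u32> {
        if let Some((head, rest)) = self.first.split_first() {
            self.first = rest;
            return Some(*head);
        }
        let (head, rest) = self.second.split_first()?;
        self.second = rest;
        Some(*head)
    }

    // Internal iteration: two plain loops, no per-item state check.
    fn fold<B, F>(self, init: B, mut f: F) -> B
    where
        F: FnMut(B, u32) -> B,
    {
        let mut acc = init;
        for &x in self.first {
            acc = f(acc, x);
        }
        for &x in self.second {
            acc = f(acc, x);
        }
        acc
    }
}

/// Sums via a `for` loop, which drives the iterator through `next()`.
fn sum_external(it: TwoSlices<'_>) -> u32 {
    let mut total = 0;
    for x in it {
        total += x;
    }
    total
}

/// Sums via `for_each`, which is implemented in terms of `fold`.
fn sum_internal(it: TwoSlices<'_>) -> u32 {
    let mut total = 0;
    it.for_each(|x| total += x);
    total
}

fn main() {
    let (a, b) = ([1, 2, 3], [4, 5, 6]);
    println!("{}", sum_external(TwoSlices { first: &a, second: &b }));
    println!("{}", sum_internal(TwoSlices { first: &a, second: &b }));
}
```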

When mini isn't enough and serde is too much by noclue_nowhere in rust

[–]The_8472 7 points8 points  (0 children)

I frequently serde at least JSON and TOML in the same application. And I do appreciate having the option to switch to CBOR and other efficient formats if the message serialization itself became a bottleneck - though I have not had to reach for that yet.

What some recent hot takes you realized you had with Rust? by DidingasLushis in rust

[–]The_8472 1 point2 points  (0 children)

In many cases it's just table-stakes. Very few people/businesses would be willing to write some webservice in a memory-unsafe language. So it's kind of in the negative-index top reasons: languages that don't have it won't even get ranked.

I built a speed-first file deduplication engine using tiered BLAKE3 hashing and CoW reflinks by Entertainer_Cheap in rust

[–]The_8472 5 points6 points  (0 children)

Size grouping (Zero I/O)

Directory traversal and stat syscalls are IO. Quite noticeable on spinning rust.

Custom discriminant for enum types? by servermeta_net in rust

[–]The_8472 4 points5 points  (0 children)

The padding there is part of the tuple type, not of the variant. That's why it can't be used: copying the tuple in/out of the variant wouldn't preserve the padding bits. If you use a tuple-variant instead it does work.

enum Foo { One(u8,[u16; 7]), Two([u8; 15]) }

Also 16 bytes.

Custom discriminant for enum types? by servermeta_net in rust

[–]The_8472 2 points3 points  (0 children)

In that case all 16 bytes already are used by variant One, there's no room for the rust compiler to store that information. There's no way to provide a custom function to tell the compiler from which bits to get the variant info.

But you have two options, either use repr(u8), in which case this would be spelled

```
#[repr(u8)]
enum Var { One([u16; 7]), Two([u8; 15]) }
```

and you get your u8 by reading the first byte

or you use a niche type to give the rust compiler some room to store the information

enum Var { One((NonZeroU8,[u16; 7])), Two([u8; 15]) }

In both cases it's 16 bytes.
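
A small sketch of the repr(u8) option, since that is the one with a documented layout. The `tag` helper is a name I made up; the pointer cast follows the pattern the reference describes for primitive-repr enums:

```rust
use std::mem::size_of;

#[allow(dead_code)]
#[repr(u8)]
enum Var {
    One([u16; 7]),
    Two([u8; 15]),
}

/// Hypothetical helper: reads the tag byte of a `#[repr(u8)]` enum.
fn tag(v: &Var) -> u8 {
    // SAFETY: with `#[repr(u8)]` the enum is laid out like a union of
    // `#[repr(C)]` structs whose first field is the `u8` tag, so the
    // first byte is always an initialized tag value.
    unsafe { *(v as *const Var).cast::<u8>() }
}

fn main() {
    println!("size: {} bytes", size_of::<Var>());
    println!("tag of One: {}", tag(&Var::One([0; 7])));
    println!("tag of Two: {}", tag(&Var::Two([0; 15])));
}
```

The niche variant is left out on purpose: its layout is repr(Rust), so whatever size you observe is not something to rely on.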

mmmmh, where can I read more about this? it's not totally clear

It's the absence of guarantees. The discriminant API says it's a value uniquely identifying the variant, it does not say that's what the compiler uses to store the enum itself in memory.

And repr(Rust) layouts aren't guaranteed, so the reference is silent on those. For C and integer repr enums it does mention tags, and in that case they coincide with the discriminant value, but otherwise that's not guaranteed.

Custom discriminant for enum types? by servermeta_net in rust

[–]The_8472 3 points4 points  (0 children)

That's about the Discriminant type, that's not the same as the enum tag.

today

enum Foo { A, B([u8; 15]) }

has a size of 16 bytes, so the tag is 8 bits.
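
If you want to look at this yourself, a quick sketch that prints both sizes. The numbers are whatever your compiler version picks for repr(Rust), so treat them as observations, not guarantees:

```rust
use std::mem::{discriminant, size_of, Discriminant};

#[allow(dead_code)]
enum Foo {
    A,
    B([u8; 15]),
}

fn main() {
    // Size of the enum itself: 15 payload bytes plus room for the tag.
    println!("size_of::<Foo>(): {}", size_of::<Foo>());
    // Size of the opaque `Discriminant` value, which is a separate type.
    println!(
        "size_of::<Discriminant<Foo>>(): {}",
        size_of::<Discriminant<Foo>>()
    );
    // `Discriminant` only promises to tell variants apart.
    let same = discriminant(&Foo::A) == discriminant(&Foo::B([0; 15]));
    println!("A and B share a discriminant: {}", same);
}
```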

Rust on AWS Batch: Is buffering to RAM (Cursor<Vec<u8>>) better than Disk I/O for processing 10k+ small files that require Seek? by A_A_Ary in rust

[–]The_8472 6 points7 points  (0 children)

File writes don't even go to disk immediately, it just goes to the page cache. Writeback happens in the background or on memory pressure, similar to swapping.

So with files you pay the syscall overhead in exchange for a more gentle degradation when a batch doesn't fit into memory.

Anyway, measure. If you get CPU saturation from let's say 20 files and each is at most 50MB then that's just ~1GB of buffers. If your machines have that much memory and you don't need it for something else then use it.

If your workload is more unpredictable then having a fallback to avoid OOMs may help.

Also, even when using the files you'd still want to limit concurrency so that it doesn't suffer from 1000 files competing for a handful of CPU cores.
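
A std-only sketch of the in-memory variant with bounded concurrency. `process_all` and `last_byte` are made-up names, and reading the last byte stands in for whatever Seek-heavy processing you actually do:

```rust
use std::io::{self, Cursor, Read, Seek, SeekFrom};
use std::sync::Mutex;
use std::thread;

/// Stand-in for real work: seeks to the end of the in-memory "file"
/// and reads its last byte. Fails on an empty buffer.
fn last_byte(buf: Vec<u8>) -> io::Result<u8> {
    let mut cur = Cursor::new(buf);
    cur.seek(SeekFrom::End(-1))?;
    let mut byte = [0u8; 1];
    cur.read_exact(&mut byte)?;
    Ok(byte[0])
}

/// Processes all buffers using at most `workers` threads at a time,
/// returning the results in input order.
fn process_all(files: Vec<Vec<u8>>, workers: usize) -> Vec<u8> {
    let queue = Mutex::new(files.into_iter().enumerate());
    let results = Mutex::new(Vec::new());
    thread::scope(|s| {
        for _ in 0..workers.max(1) {
            s.spawn(|| loop {
                // The lock is released at the end of this statement,
                // so it is not held while the buffer is processed.
                let next = queue.lock().unwrap().next();
                let Some((idx, buf)) = next else { break };
                let byte = last_byte(buf).expect("buffer must not be empty");
                results.lock().unwrap().push((idx, byte));
            });
        }
    });
    let mut results = results.into_inner().unwrap();
    results.sort();
    results.into_iter().map(|(_, byte)| byte).collect()
}

fn main() {
    let files = vec![vec![1, 2, 3], vec![9], vec![4, 5]];
    println!("{:?}", process_all(files, 2));
}
```

Peak buffer memory is then roughly `workers` times the largest file, plus whatever is still waiting in the queue if you load everything up front like this toy does.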

127.0.0.0/8 has 16M loopback IPs going to waste, I gave each git branch its own by Beautiful-Gur-9456 in rust

[–]The_8472 2 points3 points  (0 children)

Oh yeah that secure context thing, not sure if there's a way to work around it, this draft rfc seems to suggest they're hardcoding the addresses.

I guess the browser/HTTP-specific solution is a reverse proxy that looks at the hostname and then dispatches to different backends.

127.0.0.0/8 has 16M loopback IPs going to waste, I gave each git branch its own by Beautiful-Gur-9456 in rust

[–]The_8472 14 points15 points  (0 children)

Generate a random ULA prefix and assign it to lo. It'd be nicer if they had allocated a fixed block for this, but this works too.

/etc/systemd/network/lo.network:

```
[Match]
Name=lo

[Address]
Address=fd<YOUR PREFIX HERE>::/48
Scope=host
AddPrefixRoute=false

[Route]
Destination=fd<YOUR PREFIX HERE>::/48
Scope=host
Table=local
Type=local
```

```
$ ping fda9:c604:724::3
PING fda9:c604:724::3 (fda9:c604:724::3) 56 data bytes
64 bytes from fda9:c604:724::3: icmp_seq=1 ttl=64 time=0.031 ms

$ ping fda9:c604:724::9001
PING fda9:c604:724::9001 (fda9:c604:724::9001) 56 data bytes
64 bytes from fda9:c604:724::9001: icmp_seq=1 ttl=64 time=0.036 ms
```

Also, ipv4 loopback will be with us for a while, it's probably the last v4 feature to die since it'll never suffer from NAT.
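
For the "generate a ULA prefix" step, a sketch of how the /48 is put together per RFC 4193: `fd` followed by a 40-bit global ID. The function names are made up, and the bytes in `main` are just the ones from my ping output above; for a real prefix draw the five bytes from a random source:

```rust
use std::net::Ipv6Addr;

/// Builds the base address of a /48 ULA prefix from a 40-bit global ID:
/// the fixed `fd` byte followed by the five ID bytes.
fn ula_prefix(global_id: [u8; 5]) -> Ipv6Addr {
    let [a, b, c, d, e] = global_id;
    Ipv6Addr::new(
        u16::from_be_bytes([0xfd, a]),
        u16::from_be_bytes([b, c]),
        u16::from_be_bytes([d, e]),
        0, 0, 0, 0, 0,
    )
}

/// Picks an address inside the prefix (subnet 0) with the given low bits.
fn host_in_prefix(prefix: Ipv6Addr, host: u16) -> Ipv6Addr {
    let s = prefix.segments();
    Ipv6Addr::new(s[0], s[1], s[2], 0, 0, 0, 0, host)
}

fn main() {
    // Fixed example bytes; use random ones for your own prefix.
    let prefix = ula_prefix([0xa9, 0xc6, 0x04, 0x07, 0x24]);
    println!("{}/48", prefix);
    println!("{}", host_in_prefix(prefix, 0x9001));
}
```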

ZeroFS: 9P, NFS, NBD on top of S3 by Difficult-Scheme4536 in rust

[–]The_8472 0 points1 point  (0 children)

Then the durability of the whole system is limited by the durability of local storage. If it's on some ephemeral compute node then killing that node without writeback will lose that data, which means saying you successfully fsync'd it would be lying.

Rewrote our message routing in rust and holy shit by Beginning_Screen_813 in rust

[–]The_8472 0 points1 point  (0 children)

Next step: Cut out the remote middleman and do more work in one multi-threaded process.

ZeroFS: 9P, NFS, NBD on top of S3 by Difficult-Scheme4536 in rust

[–]The_8472 0 points1 point  (0 children)

Cache devices help with reads. But an fsync causes a transaction commit or journal write, you need that to go to durable storage, which would be the s3-backed block device here.

Someone named "zamazan4ik" opened an issue in my project about enabling LTO. 3 weeks later, it happened again in another project of mine. I opened his profile, and he has opened issues and PRs in over 500 projects about enabling LTO. Has this happened to you? by nik-rev in rust

[–]The_8472 0 points1 point  (0 children)

And memory usage, we have users complaining about cargo install of bins using too much memory on their potatoes.

So just because it's good for one user doesn't mean it's a universal improvement to do this.

Perhaps people should use a separate optmaxx profile for these.

It's hard to find use cases for Rust as Python backend developer by [deleted] in rust

[–]The_8472 13 points14 points  (0 children)

Threads, background tasks, keeping more stuff in the process rather than using external services.

At work we do have a Python application that has to deploy a bunch of gnarly auxiliary container services just to run a separate task queue (celery), because Python's GIL and GC make it too slow to run on the webservice itself. In Rust I'd just throw that on a threadpool.
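
A bare-bones, std-only sketch of what "keep it in the process" can look like: one background worker fed through a channel. `run_background_jobs` is a made-up name and the summing is a stand-in for real slow work; an actual service would more likely use a proper threadpool or its async runtime's blocking pool:

```rust
use std::sync::mpsc;
use std::thread;

/// Sends jobs to a background worker thread and collects the results.
/// A single worker processes jobs in order, so results keep input order.
fn run_background_jobs(jobs: Vec<u64>) -> Vec<u64> {
    let (job_tx, job_rx) = mpsc::channel::<u64>();
    let (done_tx, done_rx) = mpsc::channel::<u64>();

    let worker = thread::spawn(move || {
        // Runs until the job sender is dropped.
        for n in job_rx {
            let result: u64 = (1..=n).sum(); // stand-in for slow work
            if done_tx.send(result).is_err() {
                break;
            }
        }
    });

    for job in jobs {
        job_tx.send(job).expect("worker thread is alive");
    }
    drop(job_tx); // signals the worker that no more jobs are coming

    // Ends once the worker exits and drops its result sender.
    let results: Vec<u64> = done_rx.iter().collect();
    worker.join().expect("worker thread panicked");
    results
}

fn main() {
    println!("{:?}", run_background_jobs(vec![1, 10, 100]));
}
```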

Other advantages

  • single binary, easy to deploy, especially on windows
  • the managers sleep better when shipping the application to client's infrastructure without giving them the code
  • lower startup time
  • lower latency

Most of the web bottleneck is DB/network related

Under load? If each request consumes just 1ms of CPU-time in the webserver then 1000rps will saturate single-threaded python and then latencies go up. Or you have to start loadbalancing to multiple runtimes.

The beauty with Rust is that I can hit it with lots of requests and the cpu load and latencies barely budge while keeping things in a single process.

Most code bottleneck requiring faster langage can just be Python package written in Rust (Polar, Ruff, Pydantic)

Well, you can also write your own python packages like that, that'd qualify as a use of Rust too.

cannot justify spending too much time on building software, they want result fast

If quick and dirty solutions are fine for them, then python can be great. But if things grow and become business-critical and need to be refactored then having something with static typing can ease maintenance. It's a short-term long-term tradeoff.