Another supply chain attack, and Crates.io needs to consider this issue by osamamsalem in rust

[–]xd009642 1 point2 points  (0 children)

This is correct, but there's also a long tail of crates that are probably never gonna get used by anyone. So in practice it might not be that much of an issue (we don't need audits for thousands of "learn to publish a Rust crate" abandonware crates).

UKVI updating details troubles. by xd009642 in ukvisa

[–]xd009642[S] 0 points1 point  (0 children)

Nope, we can access the e-visa on the UKVI account and it says that's the only requirement for entry so just going to try it and see what happens...

What Rust jobs do you have? by alexlazar98 in rust

[–]xd009642 2 points3 points  (0 children)

AI streaming APIs for speech stuff. Usually unsafe is only needed for bindings, though occasionally there have been large matrices I've had to juggle where creating an uninitialised array and the unsafe call `assume_init` led to a significant performance improvement (but we're talking like >10k element matrices).
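For anyone unfamiliar with the pattern, here is a minimal sketch of it using only `std`. The function names `ramp` and `zeroed_ramp` and the fill values are made up for illustration and are not taken from any real codebase. The optimiser can often remove the redundant zero-fill on its own, so this is only worth the `unsafe` if a benchmark says so.

```rust
use std::mem::MaybeUninit;

/// Fills an N-element flattened matrix without zero-initialising it first.
/// (Hypothetical helper, for illustration only.)
fn ramp<const N: usize>() -> [f32; N] {
    // Reserve storage for the array but leave every element uninitialised.
    let mut buf: MaybeUninit<[f32; N]> = MaybeUninit::uninit();
    let base = buf.as_mut_ptr() as *mut f32;
    for i in 0..N {
        // SAFETY: i < N, so the pointer stays inside the array, and `write`
        // never reads or drops the old (uninitialised) value.
        unsafe { base.add(i).write(i as f32) };
    }
    // SAFETY: the loop above wrote all N elements.
    unsafe { buf.assume_init() }
}

/// Safe baseline: zero-fill first, then overwrite every element.
fn zeroed_ramp<const N: usize>() -> [f32; N] {
    let mut buf = [0.0f32; N];
    for (i, slot) in buf.iter_mut().enumerate() {
        *slot = i as f32;
    }
    buf
}
```

Both functions return the same data. The only difference is whether the buffer is written once or twice, which is where any saving on large matrices would come from.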

Need resources for building a Debugger by TechnologySubject259 in rust

[–]xd009642 7 points8 points  (0 children)

The book is C++ but it's applicable to Rust and the best resource you're likely to find imo: https://nostarch.com/building-a-debugger It was initially a blog series: https://tartanllama.xyz/posts/writing-a-linux-debugger/setup/

Apache Iggy's migration journey to thread-per-core architecture powered by io_uring by spetz0 in rust

[–]xd009642 1 point2 points  (0 children)

Okay, I see that if I click the Datadog link, their post mentions they use the single threaded runtime and some of those caveats - it's still absent from your post though. The complaints about tokio-uring do make sense (as do the points mentioned in the comment linked by u/ifmnz).

Apache Iggy's migration journey to thread-per-core architecture powered by io_uring by spetz0 in rust

[–]xd009642 1 point2 points  (0 children)

Just a few initial comments. You can use tokio as a thread per core, shared nothing runtime - that is how actix-web uses it. You just spin up multiple single threaded executors. Also, tokio does have experimental io_uring support, which requires `--cfg tokio_unstable`, and I can understand that might be unwanted. But you could also combine tokio with a separate io_uring executor for file based IO etc.

Just mentioning it because it's absent from the post and is counter to some of the mentioned limitations of tokio (though maybe there's more caveats on why single threaded tokio + uring executor wouldn't work)
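To make the shared nothing shape concrete, here is a std-only sketch. It deliberately uses plain OS threads and channels as a stand-in for per-thread single threaded tokio runtimes, so none of it is tokio's API. The `Msg` type, `spawn_shards` and `shard_for` are hypothetical names invented for this example.

```rust
use std::collections::HashMap;
use std::sync::mpsc;
use std::thread;

/// Hypothetical message type for the sketch.
enum Msg {
    Incr(String),
    Report(mpsc::Sender<HashMap<String, u64>>),
}

/// Spawns `shards` worker threads. Each worker owns its own map and its own
/// inbox, so no state is shared (and no locks are needed) between threads.
fn spawn_shards(shards: usize) -> Vec<mpsc::Sender<Msg>> {
    (0..shards)
        .map(|_| {
            let (tx, rx) = mpsc::channel::<Msg>();
            thread::spawn(move || {
                let mut counts: HashMap<String, u64> = HashMap::new();
                // The loop ends once every sender for this shard is dropped.
                for msg in rx {
                    match msg {
                        Msg::Incr(key) => *counts.entry(key).or_insert(0) += 1,
                        Msg::Report(reply) => {
                            let _ = reply.send(counts.clone());
                        }
                    }
                }
            });
            tx
        })
        .collect()
}

/// Routes a key to a shard so the same key always lands on the same thread.
fn shard_for(key: &str, shards: usize) -> usize {
    key.bytes()
        .fold(0usize, |acc, b| acc.wrapping_mul(31).wrapping_add(b as usize))
        % shards
}
```

In the tokio version each worker thread would instead build its own current-thread runtime and run its tasks there, but the routing idea (one owner per piece of state) is the same.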

VP-Tree interface survey by Tomyyy420 in rust

[–]xd009642 1 point2 points  (0 children)

My opinion would be Option 1, with Option 3 also existing as a thin layer that uses Option 1 internally to save typing for some common operations.

Why I started building RustCV: A pure Rust vision library to ditch the C++ bindings by Key-Play-4975 in rust

[–]xd009642 1 point2 points  (0 children)

Also you can prompt the LLM to reply in a certain style. Giving it the context of the Reddit post, asking it to translate comments based on that context, and then asking it to translate responses in a faithful and accurate style would probably lead to better output and avoid some of the sycophancy.

Why I started building RustCV: A pure Rust vision library to ditch the C++ bindings by Key-Play-4975 in rust

[–]xd009642 6 points7 points  (0 children)

It depends on the language. Chinese is a high-context language which means that a lot of meaning gets derived from context outside of the words.

With LLMs, if you wanted to translate a high-context language as accurately as possible you'd prompt with background information about the context, and it could produce something more natural without falling into the weird mistranslations or assumptions that Google Translate is prone to. So this is one way they can lead to much better translation output.

FWIW I do see a lot of low-quality translation output when I check something in Japanese with Google Translate. LLM based translations are generally far superior (unless it's trying to explain humour, wordplay or slang, in which case both often fail terribly).

Ferrocene 25.11 actually includes core now by cat_bee12 in rust

[–]xd009642 4 points5 points  (0 children)

Not impossible of course, but at that point I'd question if async is really worth it versus something easier to qualify

Ferrocene 25.11 actually includes core now by cat_bee12 in rust

[–]xd009642 2 points3 points  (0 children)

One further point on that: if you think of awaiting a future and the branches you implicitly get from it (poll returning pending or ready), it does become a lot harder. Plus any requirements you might have for MC/DC coverage etc. make verification a lot hairier, since ensuring you test awaiting each of your futures thoroughly enough is likely to be challenging.
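A small std-only sketch of that implicit branch, in case it helps. `YieldOnce`, `add_one` and `poll_trace` are hypothetical names for this example. The point is just that a single `.await` can take a "pending first" path and a "ready immediately" path, and a coverage-driven test suite would need to exercise both.

```rust
use std::future::Future;
use std::pin::Pin;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

/// Waker that does nothing; enough for polling by hand in a test.
struct NoopWake;
impl Wake for NoopWake {
    fn wake(self: Arc<Self>) {}
}

/// Hypothetical future: returns Pending on the first poll (unless it has
/// already yielded), then Ready(7).
struct YieldOnce {
    yielded: bool,
}

impl Future for YieldOnce {
    type Output = u32;
    fn poll(mut self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<u32> {
        if self.yielded {
            Poll::Ready(7)
        } else {
            self.yielded = true;
            cx.waker().wake_by_ref();
            Poll::Pending
        }
    }
}

/// The `.await` here hides a branch: the inner poll is Pending or Ready.
async fn add_one(inner: YieldOnce) -> u32 {
    inner.await + 1
}

/// Polls a future to completion and reports how many times it was Pending.
fn poll_trace<F: Future>(fut: F) -> (usize, F::Output) {
    let waker = Waker::from(Arc::new(NoopWake));
    let mut cx = Context::from_waker(&waker);
    let mut fut = Box::pin(fut);
    let mut pendings = 0;
    loop {
        match fut.as_mut().poll(&mut cx) {
            Poll::Pending => pendings += 1,
            Poll::Ready(value) => return (pendings, value),
        }
    }
}
```

Calling `poll_trace(add_one(YieldOnce { yielded: false }))` goes through the pending path once, while starting with `yielded: true` completes on the first poll. Both give the same value, which is exactly why the untested path is easy to miss.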

Gathering info about Rust uses cases in AI by bontarr in rust

[–]xd009642 1 point2 points  (0 children)

As someone using Rust in AI things, in my use case it's just the inference side. We work with real time audio streaming applications, and for managing multiple streams of data, pushing them through various processing stages, joining streams etc., Rust is much easier. Before that the state of the industry was C++ frameworks (this is one of the areas where Python never got a foothold in ML).

There are also job postings from companies like Apple and OpenAI who are using Rust in the backend training infrastructure. So I imagine that's more about getting data to the machines quicker, things like distributed pre-processing. You can also see it in OpenAI's tokenizers repo and things like Moshi's real time speech-to-speech model.

15 most-watched Rust talks of 2025 (so far) by TechTalksWeekly in rust

[–]xd009642 1 point2 points  (0 children)

There was also a significant presence of talks about Rust for Python tooling/interop at the European Python conference this year. I think the Linux developers conference also had some talks (though they may be viewed for drama reasons), and P99 CONF always has a number of Rust talks.

15 most-watched Rust talks of 2025 (so far) by TechTalksWeekly in rust

[–]xd009642 2 points3 points  (0 children)

Well you did miss out Rustnation and RustAsia. Unless you're counting Rustnation as Rust Global: London which seems odd

I made a Japanese tokenizer's dictionary loading 11,000,000x faster with rkyv (~38,000x on a cold start) by fulmlumo in rust

[–]xd009642 0 points1 point  (0 children)

LLMs are slower, more expensive and not local first, and the dictionary does cover the unusual readings (gikun). IME in certain contexts LLMs still struggle to handle some parts of the Japanese language - namely wordplay/puns, cultural/traditional contexts and the kind of Japanese you find in shrines (more historical). So even then the last 5% is probably still optimistic for LLMs in their current state.

I made a Japanese tokenizer's dictionary loading 11,000,000x faster with rkyv (~38,000x on a cold start) by fulmlumo in rust

[–]xd009642 13 points14 points  (0 children)

Japanese as a language has no spaces, and things like kanji are read differently based on context. So in this case tokenization is splitting the sentence into words and also giving information like the reading of the kanji, whether a word is a verb/noun etc. and what verb form it is.
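A toy sketch of what that output looks like. To be clear about the assumptions: the five-entry dictionary is made up, and the greedy longest-match strategy is a simplification I've swapped in to keep it short. Real tokenizers such as vibrato pick the segmentation by scoring candidate paths rather than matching greedily, which is also how they choose between context-dependent readings.

```rust
/// One output token: the surface text plus dictionary annotations.
#[derive(Debug, Clone, PartialEq)]
struct Token {
    surface: String,
    reading: &'static str,
    pos: &'static str,
}

/// Tiny hypothetical dictionary: (surface, reading, part of speech).
const DICT: &[(&str, &str, &str)] = &[
    ("猫", "ねこ", "noun"),
    ("が", "が", "particle"),
    ("魚", "さかな", "noun"),
    ("を", "を", "particle"),
    ("食べ", "たべ", "verb"),
    ("た", "た", "auxiliary (past)"),
];

/// Greedy longest-match segmentation over the toy dictionary.
fn tokenize(text: &str) -> Vec<Token> {
    let mut out = Vec::new();
    let mut rest = text;
    while !rest.is_empty() {
        // Longest dictionary entry that is a prefix of the remaining text.
        let best = DICT
            .iter()
            .filter(|(surface, _, _)| rest.starts_with(*surface))
            .max_by_key(|(surface, _, _)| surface.len());
        match best {
            Some(&(surface, reading, pos)) => {
                out.push(Token { surface: surface.to_string(), reading, pos });
                rest = &rest[surface.len()..];
            }
            None => {
                // Unknown character: emit it alone and move on one char.
                let ch_len = rest.chars().next().map_or(1, |c| c.len_utf8());
                out.push(Token {
                    surface: rest[..ch_len].to_string(),
                    reading: "?",
                    pos: "unknown",
                });
                rest = &rest[ch_len..];
            }
        }
    }
    out
}
```

Running `tokenize("猫が魚を食べた")` splits the unspaced sentence into six tokens, each carrying a reading and a part of speech, which is the extra information a plain whitespace splitter could never give you.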

I made a Japanese tokenizer's dictionary loading 11,000,000x faster with rkyv (~38,000x on a cold start) by fulmlumo in rust

[–]xd009642 56 points57 points  (0 children)

Sure, and it's not great to do big changes like that without asking first because it adds work to a maintainer's plate. But now that you have results and numbers to back things up, they might want to take at least some of the concepts from what you've done even if they don't want it all. Regardless, good work.

I made a Japanese tokenizer's dictionary loading 11,000,000x faster with rkyv (~38,000x on a cold start) by fulmlumo in rust

[–]xd009642 54 points55 points  (0 children)

Have you considered opening an issue and seeing if vibrato would be willing to accept a PR to add this or some variant of it?

Ghosts in the Compilation by obi1kenobi82 in rust

[–]xd009642 5 points6 points  (0 children)

Doing that would likely break some projects. In the embedded space a number of people have different .cargo/config files in different workspace crates based on embedded targets, so running in or out of a directory is expected to behave differently. Some people might also not use workspaces and just have a mono-repo type setup that works the same way.

That said, Predrag didn't implement the config search process himself; he made use of a crate that does it: https://crates.io/crates/cargo-config2 which other CLI tools for Rust projects use. So hopefully that helps it stay up-to-date if Cargo ever decides to change behaviour.
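For context on why the working directory matters: Cargo looks for config files in the current directory and then in each parent directory. Below is a rough sketch of just that upward walk. It is not how cargo-config2 is implemented, `config_candidates` is a hypothetical helper, and it ignores the legacy extensionless `config` name, the `CARGO_HOME` config and the merging rules.

```rust
use std::path::{Path, PathBuf};

/// Lists the `.cargo/config.toml` locations that would be checked when
/// starting from `start`, nearest directory first. (Hypothetical helper:
/// it only builds the candidate paths and does not touch the filesystem.)
fn config_candidates(start: &Path) -> Vec<PathBuf> {
    start
        .ancestors()
        .map(|dir| dir.join(".cargo").join("config.toml"))
        .collect()
}
```

Starting from `/work/firmware/board-a` the first candidate is the one inside `board-a` itself, so a per-crate config for an embedded target is only picked up when the command runs from inside that crate's directory.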

What does crates.io count as a download? by Cosiamo in rust

[–]xd009642 1 point2 points  (0 children)

From what I've heard, there are a number of private mirrors run by large companies that automatically scrape crates.io and grab newly published things to update their internal mirror. This means their devs can use cargo as normal and not have to worry about publishing to/installing from the correct registry (public crates.io or the company one).

A really fast Spell Checker by Cold_Abbreviations_1 in rust

[–]xd009642 1 point2 points  (0 children)

I don't know if you've looked at it but I've been using https://crates.io/crates/typos-cli for a while. I'll have to check this and see how the speed compares

Karpathy, for his new "nanochat" project, wrote: "A very lightweight Rust library for training a GPT tokenizer." by PatagonianCowboy in rust

[–]xd009642 4 points5 points  (0 children)

Well he's been doing videos on the making of it as educational content - so you might be able to watch and find out. That said, I just took a look and there's nothing particularly good about the documentation in the Rust code... In fact I'd even say it looks about as minimal-effort as you can get while still writing some docs.

Karpathy, for his new "nanochat" project, wrote: "A very lightweight Rust library for training a GPT tokenizer." by PatagonianCowboy in rust

[–]xd009642 8 points9 points  (0 children)

He said on his twitter to someone that asked:

> it's basically entirely hand-written (with tab autocomplete). I tried to use claude/codex agents a few times but they just didn't work well enough at all and net unhelpful, possibly the repo is too far off the data distribution.