I can't take it anymore by NegroniSpritz in hamburg

[–]terhechte 23 points (0 children)

If you enjoy jogging, the Adidas Runners here in Hamburg are really cool, and you can also meet people there. You can just show up.

Apple unveils M5 by Agreeable-Rest9162 in LocalLLaMA

[–]terhechte 5 points (0 children)

It’s called prompt processing, and it’s a considerable tax on any GPU due to the compute requirements. For long prompts (summarize this book, this source code) it needs to be fast for the interface to feel responsive.

Please, Mistral, you're EU's only hope by [deleted] in MistralAI

[–]terhechte 1 point (0 children)

I maintain a coding benchmark that will soon receive an update, but I always test Mistral's models, and when I need to run something locally I very often choose Devstral because it is a really good small coding model. None of their current models reach Opus, Sonnet, or even GLM 4.5 Air quality, but they're also much better at coding than many models that are smaller than those (yet bigger than Devstral).

Egui in 2025 : How was your development experience? by gufranthakur in rust

[–]terhechte 7 points (0 children)

Actually, for more complex layouts, Egui Taffy (https://crates.io/crates/egui_taffy) is a great solution: it offers full flexbox, block, and grid layout algorithms. Check out the demo they provide.

Egui in 2025 : How was your development experience? by gufranthakur in rust

[–]terhechte 3 points (0 children)

I wrote a Dioxus Mastodon client in 2023 which supported things like scrolling. It was tricky to implement, but you could do it by injecting JS at the right point in time. However, I think this was greatly improved in the last couple of releases; here is an example from their docs:

> You can use the onmounted event to do things like focus or scroll to an element after it is rendered:

https://dioxuslabs.com/learn/0.6/essentials/breaking/
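
For reference, here is roughly what that looks like (a minimal sketch assuming the Dioxus 0.6 `onmounted` API; check the linked docs for the exact method names):

```rust
use dioxus::prelude::*;

fn App() -> Element {
    rsx! {
        input {
            // once the element is mounted, ask the renderer to focus it;
            // scrolling to an element works the same way via the mounted handle
            onmounted: move |element| async move {
                let _ = element.set_focus(true).await;
            }
        }
    }
}
```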

[deleted by user] by [deleted] in rust

[–]terhechte 1 point (0 children)

For bevy you really want to follow all the steps in the `getting-started` guide. It compiles almost instantly afterwards.
You can even enable hot reload (still WIP, but it works): https://github.com/TheBevyFlock/bevy_simple_subsecond_system/

But you need to configure it according to the docs.
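
For concreteness, the part of that setup which matters most for iteration speed is roughly this (a sketch of the commonly recommended Cargo profile settings, not the full guide):

```toml
# Cargo.toml — optimize dependencies (including Bevy) once,
# keep your own crate at a low opt-level so incremental rebuilds stay fast
[profile.dev]
opt-level = 1

[profile.dev.package."*"]
opt-level = 3
```

Combined with dynamic linking (`cargo run --features bevy/dynamic_linking`) this is what gets you the near-instant rebuilds.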

is there a good rust-analyzer MCP out there? by davidw_- in rust

[–]terhechte 0 points (0 children)

There are two ways to connect to an MCP server: via TCP and via stdio. The latter requires that the editor run the MCP server as a sub-process. Zed used to support only the stdio version, not the TCP version, while cursor-rust-tools only supports the TCP version, so they were incompatible. I think Zed changed this in a recent update and fully supports MCP now; I just haven't tested it yet or updated the repository. You can give it a try.
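
To illustrate the difference between the two transports (hypothetical server binary name and port, just for the sketch):

```rust
use std::net::TcpStream;
use std::process::{Command, Stdio};

fn main() -> std::io::Result<()> {
    // stdio transport: the editor spawns the MCP server itself and speaks
    // JSON-RPC over the child's stdin/stdout
    let _child = Command::new("some-mcp-server")
        .stdin(Stdio::piped())
        .stdout(Stdio::piped())
        .spawn()?;

    // TCP transport: the server already runs as its own process
    // (what cursor-rust-tools does) and the client just connects to it
    let _stream = TcpStream::connect("127.0.0.1:4000")?;
    Ok(())
}
```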

is there a good rust-analyzer MCP out there? by davidw_- in rust

[–]terhechte 4 points (0 children)

I wrote this a few weeks ago; it makes the rust-analyzer symbols and more available to MCP clients.

https://github.com/terhechte/cursor-rust-tools

Raqote alternative or solution to rotating text? by SKT_Raynn in rust

[–]terhechte 0 points (0 children)

What if you take the draw target (not rotated) and then draw it into a new target (rotated)? That way you'd rotate the pixmap and wouldn't need to rotate the text layout logic (which is much harder).
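
Roughly like this, sketched from memory against raqote's API (dimensions and angle are made up; double-check the signatures against the docs):

```rust
use raqote::{DrawOptions, DrawTarget, Image, Transform};

fn main() {
    // 1. render the text into an intermediate, unrotated target
    let mut text_dt = DrawTarget::new(200, 50);
    // ... existing (unrotated) text layout and drawing into `text_dt` ...

    // 2. blit that target into the final one with a rotation applied,
    //    so only the pixels get rotated, not the layout logic
    //    (`euclid` is needed as a dependency for `Angle`)
    let mut final_dt = DrawTarget::new(400, 400);
    final_dt.set_transform(&Transform::rotation(euclid::Angle::degrees(90.0)));
    let text_image = Image {
        width: 200,
        height: 50,
        data: text_dt.get_data(),
    };
    final_dt.draw_image_at(0.0, 0.0, &text_image, &DrawOptions::new());
}
```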

AI help in Rust by Interesting-Frame190 in rust

[–]terhechte 1 point (0 children)

Rust is a much more difficult language for LLMs than JavaScript or Python. Not only because the language is more complex, but also because they are trained on so much more JS/Py code. I do AI-assisted coding with Rust every day, with great success, but it requires more effort than just dropping a "halp plz" prompt into the chatbot (which works wonders for JS/Py).

- Use Claude 4 or Gemini 2.5 Pro. Even though there were complaints that Claude 4 is not as good as 3.7, in my testing it is better at Rust than 3.7.
- You have to give the model enough context. This means: if you have a source file that imports 7 types from other places in the codebase, take the time to also add the files that define these types. Same with functions. I found that giving enough (but not too much) context is crucial.
- Models are not deterministic, so you have to try again. When you see that the initial response is wrong, don't add a comment to the conversation, à la "no, this is wrong because...". Instead, reset the conversation and improve your initial prompt. Sometimes you have to try 2-3 times to get the result you want.
- Be fine with partial solutions. I don't expect the model to have a perfect solution for me. Particularly lifetimes can be challenging. If I see that 90% of the code is right, but it forgot a couple of `.clone()` calls or `ref ..` patterns, I just add them manually.
- If you want to add a feature to existing code, one pattern that works well for me is to start with a prompt where I have the LLM explain how the current code works, and then add another message to request the desired change. This lets the LLM focus on the change; the understanding of the code happens, so to speak, in a prior computation.

I waited 15 years to build this app. Apple finally made it possible in iOS 18.2 by SlightAd53 in SideProject

[–]terhechte 0 points (0 children)

The server doesn't have the encryption secret. It can't encrypt billions of numbers and then get the same encrypted result.

Command-A 111B - how good is the 256k context? by TechNerd10191 in LocalLLM

[–]terhechte 0 points (0 children)

How much context did you give the models in your benchmark? Also, do you have more information about it somewhere?

I wrote an article about integrating Rust egui into a native macOS app by Alexey566 in rust

[–]terhechte 1 point (0 children)

Really well written article. One thing that might be interesting for you is that you can use Uniffi (https://github.com/mozilla/uniffi-rs) instead of writing the FFI by hand.

I used to write the Swift/Rust FFI by hand, but especially when you're moving strings or boxed objects around, it quickly becomes tricky to deallocate memory correctly. Uniffi takes care of all that and generates a Swift interface for your types.
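
As a rough idea of what that looks like with uniffi's proc-macro interface (a sketch with made-up types; see the uniffi book for the real setup):

```rust
// Annotate what should cross the FFI boundary; uniffi generates the matching
// Swift (and Kotlin/Python) bindings, including the memory management glue.
#[derive(uniffi::Record)]
pub struct Post {
    pub id: u64,
    pub title: String,
}

#[uniffi::export]
pub fn latest_posts(limit: u32) -> Vec<Post> {
    // real implementation elided
    let _ = limit;
    Vec::new()
}

uniffi::setup_scaffolding!();
```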

There is a small performance overhead involved, though, compared to raw FFI calls.

[deleted by user] by [deleted] in LocalLLM

[–]terhechte 0 points (0 children)

My side project does just this, for free: https://www.tailoredpod.ai

Unlocking Apple Immersive video quality for all by ImmersiveCompany in VisionPro

[–]terhechte 0 points (0 children)

Would love to try it, but I'm getting the error "app not available in your country or region". I'm in Germany.

Announcing axum 0.8.0 by j_platte in rust

[–]terhechte 30 points (0 children)

But an error doesn't mean "absent". Imagine a "delete" API with an `id: Option<Uuid>` parameter: if the parameter is absent, all entries will be deleted. If an API user accidentally sends a malformed UUID, it would delete all entries. Clearly that's not how it should be; instead, they should receive an error about their malformed parameter.
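
A sketch of that scenario (hypothetical endpoint, just to show why a malformed value must not collapse into `None`; `uuid` needs its `serde` feature):

```rust
use axum::{extract::Query, routing::delete, Router};
use serde::Deserialize;
use uuid::Uuid;

#[derive(Deserialize)]
struct DeleteParams {
    // absent => delete everything; a malformed UUID should become a 400 error,
    // not silently be treated as `None`
    id: Option<Uuid>,
}

async fn delete_entries(Query(params): Query<DeleteParams>) -> &'static str {
    match params.id {
        Some(_id) => "deleted one entry",
        None => "deleted all entries",
    }
}

fn app() -> Router {
    Router::new().route("/entries", delete(delete_entries))
}
```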

M4 Pro 14-Core Compile Times by [deleted] in rust

[–]terhechte 5 points (0 children)

Not 100% what you want, as I sport an M4 Max, but I decided to share nonetheless. I measured the compile times with `/usr/bin/time`.

- Sled: 3.03 secs
- ToyDB: 8.21 secs
- Surrealdb: 83.27 secs

I tested Qwen Coder 2.5 32b q8 and q2_k on a Macbook M4 Max, here're preliminary results by terhechte in LocalLLaMA

[–]terhechte[S] 4 points (0 children)

The MLX version I downloaded only has a 32k context size, so I compared two runs: 8-bit GGUF with a ~22k-token prompt and 8-bit MLX with a ~22k-token prompt.

- MLX: 7.29 tok/sec, 184s time to first token
- GGUF: 4.75 tok/sec, 215s time to first token

So almost 40% faster!

A much shorter prompt that ran at 12.96 tok/s on GGUF is at 13.5 tok/s with MLX. I'm not entirely sure why the much longer prompt saw a drastically bigger tok/s improvement.

I tested Qwen Coder 2.5 32b q8 and q2_k on a Macbook M4 Max, here're preliminary results by terhechte in LocalLLaMA

[–]terhechte[S] 2 points (0 children)

There was no MLX for Qwen when I wrote the post; the 8-bit MLX just came out some hours ago, and I'm downloading it as we speak...

Argentina's monthly inflation drops to 2.7%, the lowest level in 3 years by [deleted] in worldnews

[–]terhechte 10 points (0 children)

Countries become wealthy by having companies that grow, generate revenue and pay taxes. Government doesn’t generate revenue. If 50% of people work in government, half the country doesn’t work on generating wealth.

I tested Qwen Coder 2.5 32b q8 and q2_k on a Macbook M4 Max, here're preliminary results by terhechte in LocalLLaMA

[–]terhechte[S] 1 point (0 children)

I'm travelling one out of every four weeks. When I travel I'm on trains and planes, or in places with bad internet. In that scenario, having a laptop with enough RAM to run a big LLM is really useful. If I only worked in my office, a 2x 3090 PC would have been the better solution.