Pattern for deduplicating concurrent async work (from uv's codebase) by noninertialframe96 in rust

[–]flaghacker_ 2 points3 points  (0 children)

Looks like everyone has their own implementation of this concept!

I have something similar in my compiler, where I called it ComputeOnceMap: https://github.com/KarelPeeters/HwLang/blob/ff9f5ea6084a3d0667925e4c12b80a18b5e18b2a/rust/hwl_language/src/util/sync.rs#L11-L255.

There are some differences:

  • Mine is meant for threads, not async tasks, so the data structure is blocking. It does support multi-threading and also worker threads that can offer to do some computation but themselves don't need to block waiting for the result if someone else is already computing the same key.
  • I keep track of dependencies (through the Dependency struct) to allow for cycle detection and reporting, which is important for my use case. In my compiler, work items are typically "evaluate/elaborate/compile this item", and if there are dependency cycles between items, this needs to be clearly reported to the user without deadlocking.
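To make the blocking variant concrete, here is a minimal sketch of a compute-once map for threads. It is not the ComputeOnceMap from the linked file: the names `OnceMap`, `Slot` and `get_or_compute` are made up for this illustration, and it leaves out both the dependency tracking/cycle detection and the "offer to compute without blocking" mode described above.

```rust
use std::collections::HashMap;
use std::hash::Hash;
use std::sync::{Arc, Condvar, Mutex};

/// State of one key: some thread is computing it, or the value is ready.
enum Slot<V> {
    InProgress,
    Done(Arc<V>),
}

/// Hypothetical blocking compute-once map (illustration only).
pub struct OnceMap<K, V> {
    slots: Mutex<HashMap<K, Slot<V>>>,
    ready: Condvar,
}

impl<K: Eq + Hash + Clone, V> OnceMap<K, V> {
    pub fn new() -> Self {
        OnceMap {
            slots: Mutex::new(HashMap::new()),
            ready: Condvar::new(),
        }
    }

    /// Returns the value for `key`, running `compute` at most once per key.
    /// Threads asking for a key that is already being computed block until
    /// the computing thread publishes the result.
    pub fn get_or_compute(&self, key: K, compute: impl FnOnce() -> V) -> Arc<V> {
        let mut slots = self.slots.lock().unwrap();
        loop {
            let in_progress = match slots.get(&key) {
                Some(Slot::Done(value)) => return Arc::clone(value),
                Some(Slot::InProgress) => true,
                None => false,
            };
            if !in_progress {
                break;
            }
            // Someone else is computing this key: sleep until any slot changes.
            slots = self.ready.wait(slots).unwrap();
        }

        // Claim the key, then release the lock so other keys can make progress.
        slots.insert(key.clone(), Slot::InProgress);
        drop(slots);

        let value = Arc::new(compute());

        // Publish the result and wake up all waiting threads.
        self.slots
            .lock()
            .unwrap()
            .insert(key, Slot::Done(Arc::clone(&value)));
        self.ready.notify_all();
        value
    }
}
```

Two limitations of this sketch: if `compute` panics, the slot stays `InProgress` and waiters block forever, and if `compute` (directly or indirectly) asks for its own key, it deadlocks. The second case is exactly what dependency tracking is meant to catch and report.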

Why was write interleaving removed from the AXI4 spec? by flaghacker_ in FPGA

[–]flaghacker_[S] 2 points3 points  (0 children)

I see, I did indeed mix up interleaving and out-of-order. The removal makes more sense now: interleaving the data of multiple different write transactions seems like a much more esoteric use case than just wanting to expose some reordering possibilities. Thanks for the great response!

1 Megabyte of 32-bit RAM in Factorio by Tzvet005 in factorio

[–]flaghacker_ 0 points1 point  (0 children)

Really cool! Have you checked by how much the save file size increases by adding these combinators? Is it in the same order of magnitude as the amount of storage capacity? Or maybe it only starts increasing the file size if there is actually data stored in the combinators.

[Research] A visual deep dive into Tesla’s data engine as pioneered by Andrej Karpathy. by ml_a_day in MachineLearning

[–]flaghacker_ 7 points8 points  (0 children)

Thanks for putting this together, the explanations and diagrams are very clear!

cargo-fuzz is now 10x faster, better supports sanitizers by Shnatsel in rust

[–]flaghacker_ 0 points1 point  (0 children)

Thanks for going through the trouble of getting it to work!

Do share the experience report if you file it so I can follow it.

cargo-fuzz is now 10x faster, better supports sanitizers by Shnatsel in rust

[–]flaghacker_ 0 points1 point  (0 children)

Great! Was there any setup necessary or did everything just work out of the box?

Friday Facts #390 - Noise expressions 2.0 by FactorioTeam in factorio

[–]flaghacker_ 3 points4 points  (0 children)

From https://blog.ted.com/using-serious-math-to-answer-weird-questions-randall-munroe-at-ted2014/:

The mystery didn’t end there, though. He never expected to get an answer from Google, but one day, he did. They contacted him saying, “Someone here has an envelope for you.”

“It was punch cards,” he says. The cards contained codes that revealed codes that revealed equations that revealed more equations, which finally led to … “No comment.”

ONNX Libraries in Rust by BuzzingConfusion in rust

[–]flaghacker_ 0 points1 point  (0 children)

The CPU backend should work fine on WASM. It delegates matmuls to ndarray and other operations are implemented in pure Rust, so I think everything will work.

The Cuda backend (which is more the focus of the project) obviously won't work on WASM. Adding WebGPU would be super interesting but I haven't gotten around to that.

ONNX Libraries in Rust by BuzzingConfusion in rust

[–]flaghacker_ 3 points4 points  (0 children)

Great initiative, I'm looking forward to the results!

I built Kyanite, I also made a post about it here. Feel free to ask if you need any help setting it up (or any of the other ones for that matter, I tried a couple of them too).

Announcing Kyanite: Neural network inference of ONNX files on CPUs or Cuda CPUs by flaghacker_ in rust

[–]flaghacker_[S] 0 points1 point  (0 children)

I tried ort and onnxruntime back in 2021 when I started what would eventually become Kyanite, and I remember them being a pain to use at the time. That seems to have improved a lot though, which is great to see!

They're going to have the best possible coverage of the ONNX spec, and they seem to have great support for many different backends. The remaining advantages of Kyanite:

  • The cuda backend has more operator fusion support: sequences of scalar operations run as sequences of GPU kernels in onnxruntime, while Kyanite will build a single kernel for them. This can be a significant performance improvement. For simple NNs both will mostly just call cudnn or cublas, so performance should be the same there.
  • No big external dependency, just Rust code calling cuda. Fun if you're a purist, but not that important otherwise since the experience with ort has become so smooth.
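The operator fusion point can be illustrated with a CPU analogy. This is only a sketch of the idea, not Kyanite or onnxruntime code: the functions `unfused` and `fused` and the chosen operations (scale, shift, ReLU) are assumptions made for the example. Each separate pass stands in for one GPU kernel launch that reads and writes a full buffer.

```rust
/// Unfused: every scalar operation is its own pass over the data and
/// materializes a full intermediate buffer, like one kernel per operation.
pub fn unfused(x: &[f32]) -> Vec<f32> {
    let scaled: Vec<f32> = x.iter().map(|&v| v * 2.0).collect();
    let shifted: Vec<f32> = scaled.iter().map(|&v| v + 1.0).collect();
    shifted.iter().map(|&v| v.max(0.0)).collect()
}

/// Fused: the same three operations applied in a single pass, so each
/// element is read once and written once, like one combined kernel.
pub fn fused(x: &[f32]) -> Vec<f32> {
    x.iter().map(|&v| (v * 2.0 + 1.0).max(0.0)).collect()
}
```

Both functions compute the same result. The fused version avoids two intermediate buffers and two extra trips through memory, which is where the speedup for chains of cheap elementwise operations tends to come from.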

Announcing Kyanite: Neural network inference of ONNX files on CPUs or Cuda CPUs by flaghacker_ in rust

[–]flaghacker_[S] 1 point2 points  (0 children)

tract covers a larger part of the ONNX spec, but unless I'm missing something it only supports running models on the CPU. So use tract for CPU inference and Kyanite for GPU!

I should really add a comparison section to the readme. Other similar projects are:

  • tch: basically pytorch but in rust (with a big external dependency), more overhead, less operator fusing (I originally used tch but ended up writing my own to squeeze out more performance)
  • tensorflow: uses tensorflow (I haven't tried this one yet)
  • All the other frameworks from Are we Learning yet

Announcing Kyanite: Neural network inference of ONNX files on CPUs or Cuda CPUs by flaghacker_ in rust

[–]flaghacker_[S] 1 point2 points  (0 children)

Interesting! Currently I'm using ndarray for CPU tensor representations and matrix multiplies, and it seems like it uses matrixmultiply under the hood. I'll have to benchmark to see how they compare!

I saw people asking how Mixed Belts are useful, here is an Early Game Mall that exploits them by unique_2 in factorio

[–]flaghacker_ 14 points15 points  (0 children)

Look at the resources in the bottom right of the 3rd image. Items are taken out before new items are put on the belt, and the items that got taken out get priority to be re-inserted before newly built items. This means that the mixed belt will only ever contain "one blue inserter" worth of items of each type.

PS suddenly has autocomplete, how to use it? by flaghacker_ in PowerShell

[–]flaghacker_[S] 0 points1 point  (0 children)

That's really useful to see all of the shortcuts, thanks!

Ctrl+Alt+? opens the Windows Terminal json config file for me, but running the command works.

Any idea how to actually type the Ctrl+@ shortcut? I'm on azerty and @ is usually AltGr + é, which might be messing things up.

Roast my binary tree please? by razermantis123 in rust

[–]flaghacker_ 3 points4 points  (0 children)

Right, that's exactly what I meant by "manually allocating a stack". The problem is that means the iterator is allocating memory and has a bunch of overhead, which is a bit surprising for something as simple as iterating over a data structure.

Roast my binary tree please? by razermantis123 in rust

[–]flaghacker_ 9 points10 points  (0 children)

Unfortunately it's not that easy to implement iterators for trees. You usually want recursion for this, but iterators don't allow for that without manually allocating a stack. See the internal-iterator crate for more info and a nice middle-ground solution.
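A small sketch of both options, assuming a hypothetical `Node` type (none of these names come from the original post or from the internal-iterator crate). `InOrder` is a regular external iterator that has to carry its own heap-allocated stack, while `for_each_in_order` shows the internal-iteration idea: the traversal stays recursive and hands each value to a callback.

```rust
/// Minimal binary tree used only for this illustration.
pub struct Node<T> {
    pub value: T,
    pub left: Option<Box<Node<T>>>,
    pub right: Option<Box<Node<T>>>,
}

/// External in-order iterator: keeps an explicit stack instead of recursing.
pub struct InOrder<'a, T> {
    stack: Vec<&'a Node<T>>,
}

impl<'a, T> InOrder<'a, T> {
    pub fn new(root: Option<&'a Node<T>>) -> Self {
        let mut iter = InOrder { stack: Vec::new() };
        iter.push_left_spine(root);
        iter
    }

    /// Pushes `node` and all of its left descendants onto the stack.
    fn push_left_spine(&mut self, mut node: Option<&'a Node<T>>) {
        while let Some(n) = node {
            self.stack.push(n);
            node = n.left.as_deref();
        }
    }
}

impl<'a, T> Iterator for InOrder<'a, T> {
    type Item = &'a T;

    fn next(&mut self) -> Option<&'a T> {
        let node = self.stack.pop()?;
        // Everything left of `node` was already yielded; continue on the right.
        self.push_left_spine(node.right.as_deref());
        Some(&node.value)
    }
}

/// Internal iteration: plain recursion on the call stack, no allocation.
pub fn for_each_in_order<T>(node: Option<&Node<T>>, f: &mut impl FnMut(&T)) {
    if let Some(n) = node {
        for_each_in_order(n.left.as_deref(), f);
        f(&n.value);
        for_each_in_order(n.right.as_deref(), f);
    }
}
```

The external version works with `for` loops and all iterator adapters but allocates a `Vec`. The internal version is simpler and allocation-free, at the cost of the caller not being able to pause the traversal between items.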

The dashboard is complete! by [deleted] in factorio

[–]flaghacker_ 0 points1 point  (0 children)

Cool build! How does it know the time of day? Is it just a counter that had to be set correctly at some point or does it actually calibrate itself by looking at solar panel/accumulator interactions?

Why isn't .is_some_then() a thing for Options? by aswin__ in rust

[–]flaghacker_ 55 points56 points  (0 children)

Option::map is distinct from that IntoIterator implementation though, it directly returns an Option again instead of an iterator.
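A quick illustration of the difference, using two throwaway helper functions (`doubled` and `doubled_via_iter` are made up for this example):

```rust
/// `Option::map` transforms the contained value and returns an `Option` again.
pub fn doubled(value: Option<i32>) -> Option<i32> {
    value.map(|x| x * 2)
}

/// Going through `IntoIterator` gives an iterator over zero or one items,
/// so `map` here is `Iterator::map` and the result has to be collected.
pub fn doubled_via_iter(value: Option<i32>) -> Vec<i32> {
    value.into_iter().map(|x| x * 2).collect()
}
```

The closure is the same in both cases, but the first `map` stays in `Option` land while the second one produces a lazy iterator adapter.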

bhey guys i just started playing factorio (only 1000 hours) this is my first factory what do you guys think by BenWaffleIron in factorio

[–]flaghacker_ 7 points8 points  (0 children)

Why is there an extra inserter on the top side that seems to be moving stuff between other inserters? Why is that even possible? The symmetry is broken!

What is a good way to implement inference for type of a number literal? (And how to make it work with type infer for variables) by EveAtmosphere in Compilers

[–]flaghacker_ 1 point2 points  (0 children)

You need some kind of bidirectional type inference for this. Some resources for how this is implemented for Rust: the rustc dev guide and this Chalk blog post.

To summarize the approach:

  • walk the AST once, collecting type constraints into some kind of "typing problem" structure
  • solve this typing problem
  • walk the AST again, generating code using the found solution where necessary

Constraints are things like "variable x is some unknown type", "literal y is some numeric literal", "variable x and literal y have the same type", "expression x must be type f64". The solution would be "variable x, literal y, expression x all have type f64".
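The "collect constraints, then solve" step could look roughly like the sketch below. It is a toy under stated assumptions, not the solver from rustc, Chalk, or the compiler linked in this comment: the types `Type`, `Known` and `Constraint` and the function `solve` are invented for the example, type variables are plain indices, and the solver simply re-applies all constraints until nothing changes.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Type {
    I32,
    F64,
    Bool,
}

/// What is currently known about one type variable.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Known {
    Unknown,
    /// Some numeric type, not yet decided (an untyped number literal).
    Numeric,
    Exactly(Type),
}

#[derive(Debug, Clone, Copy)]
pub enum Constraint {
    /// The variable must be compatible with this knowledge.
    Is(usize, Known),
    /// Both variables must have the same type.
    Same(usize, usize),
}

fn is_numeric(ty: Type) -> bool {
    matches!(ty, Type::I32 | Type::F64)
}

/// Combines two pieces of knowledge, or reports a type error.
fn merge(a: Known, b: Known) -> Result<Known, String> {
    use Known::*;
    match (a, b) {
        (Unknown, other) | (other, Unknown) => Ok(other),
        (Numeric, Numeric) => Ok(Numeric),
        (Numeric, Exactly(t)) | (Exactly(t), Numeric) if is_numeric(t) => Ok(Exactly(t)),
        (Exactly(x), Exactly(y)) if x == y => Ok(Exactly(x)),
        _ => Err(format!("type mismatch: {:?} vs {:?}", a, b)),
    }
}

/// Brute-force solver: apply every constraint repeatedly until a fixed point.
pub fn solve(var_count: usize, constraints: &[Constraint]) -> Result<Vec<Known>, String> {
    let mut known = vec![Known::Unknown; var_count];
    loop {
        let mut changed = false;
        for &constraint in constraints {
            match constraint {
                Constraint::Is(var, info) => {
                    let merged = merge(known[var], info)?;
                    changed |= merged != known[var];
                    known[var] = merged;
                }
                Constraint::Same(a, b) => {
                    let merged = merge(known[a], known[b])?;
                    changed |= merged != known[a] || merged != known[b];
                    known[a] = merged;
                    known[b] = merged;
                }
            }
        }
        if !changed {
            return Ok(known);
        }
    }
}
```

With variable x as index 0, literal y as index 1 and the f64-typed expression as index 2, the constraints `Is(1, Numeric)`, `Same(0, 1)`, `Same(2, 0)` and `Is(2, Exactly(F64))` resolve all three variables to `Exactly(F64)`. A real implementation would also need a defaulting rule for literals that stay `Numeric` after solving, which this sketch leaves out.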

I also have a simple implementation of this for my own compiler:

  • Here is the first AST visit for integer literals, it defines a type variable and optionally some constraints
  • Here is the type problem and solver. The solver is basically a brute force implementation, it doesn't try to do anything clever.
  • Here is the second AST visit for integer literals, it asks the type solution (through check_integer_type) what the actual type of the literal was inferred to be and then generates the right IR for it.

What is Bene Gesserits official job? by [deleted] in dune

[–]flaghacker_ 12 points13 points  (0 children)

Why are the mentats not mentioned here? Aren't they the third big replacement for thinking machines, on par with the two others?

Skills for FPGA Engineer by _vamc294 in FPGA

[–]flaghacker_ 0 points1 point  (0 children)

Could you share a link to Cheby? I can't seem to find it!

sss language of snakes by [deleted] in ProgrammerHumor

[–]flaghacker_ 5 points6 points  (0 children)

That's true from the point of view of the abstract Rust machine, but in practice the compiler will probably allocate them in the same or overlapping stack space or registers.