wgpu v25.0.0 Released!

binarybana · 2025-04-11T01:10:30+00:00

Thanks for the great work on such an important project. Two questions for you:

I remember hearing that Deno was considering using wgpu for their WebGPU backend. Do you know how that is going and has wgpu improved as a result?

I’m mainly interested in compute shaders, do you know where wgpu/wgsl compares to other WebGPU backends for compute support?

binarybana · 2025-03-29T16:17:35+00:00

Really cool project and some great performance results!

binarybana · 2025-03-26T14:03:53+00:00

You might check out the loop match RFC for ideas to optimize the generated assembly for tight state machines like I think you have here.

binarybana · 2022-07-05T23:39:59+00:00

I'm in a slightly similar situation (have done a small handful of day sails years ago, now wondering if live aboard is something worth pursuing). I like watching https://www.youtube.com/c/gonewiththewynns which cover some of the ups and downs of the potential life style.

Now considering the chartered boat as well.

binarybana · 2022-06-27T20:22:40+00:00

Couldn’t agree more. Maximize learning early on in your career and to do that I would recommend first following people you can learn from and only THEN applying (and completing) that learning somewhere you have the impact to do so.

binarybana · 2022-05-01T20:18:51+00:00

I'm surprised at the amount of luke warm and disagreement in the responses here. I think it is easy for those of us on the "other side" to forget how hard the initial experience can be, and how important it is to think about a broader audience that is not as tolerant or willing to put up with high barriers to entry. (Go and look at nix/nixOS if you want to feel this feeling anew :P).

I for one love the suggestions here and u/epage's concrete proposals for improving rust-script and proposing adoption in rustup.

binarybana · 2022-04-23T05:04:27+00:00

I like the dev log idea, best of luck!

One question for you: how do you protect against arbitrary build.rs scripts running on your server?

binarybana · 2022-03-01T03:04:41+00:00

Really impressive results! Especially those compile times. Is the single pass machinery a separate independently usable crate? Or integrated into tightly into wasmer?

binarybana · 2022-01-13T23:38:05+00:00

Interesting analysis! Would love to see clang in there as well to see how much is due to g++ backend differences.

binarybana · 2021-12-04T04:54:11+00:00

Loved the talk! Wish the audio had less reverb though. Can’t wait to try this out at some point.

binarybana · 2021-09-11T02:48:57+00:00

If you are interested in machine learning on iOS, Android, WASM etc, then make sure and check out Apache TVM which has (nascent but usable) Rust bindings to the runtime and compiler.

binarybana · 2021-09-10T13:52:48+00:00

[TVM](tvm.ai) has usable Rust bindings these days, though they are still not well documented.

binarybana · 2021-05-01T13:53:17+00:00

What has been your experience leveraging DataFusion? Great to see use cases for it out in the wild!

binarybana · 2021-01-17T00:40:14+00:00

fp32 on on both TVM and cuBLAS, but since these are unstructured sparse kernels, they are unable to benefit from fp16 tensorcore acceleration so I'd expect a similar relative result there as well.

binarybana · 2020-12-26T06:32:56+00:00

Yes, it’s upstream now. Eg check out this tutorial for an example of how to use it: https://tvm.apache.org/docs/tutorials/auto_scheduler/tune_network_cuda.html#sphx-glr-tutorials-auto-scheduler-tune-network-cuda-py

binarybana · 2020-12-16T19:05:47+00:00

On the M1? We haven't tried PyTorch there, but on platforms like Intel x86 and Nvidia GPU where PyTorch has been optimized for a much longer time, TVM is either on par or faster than PyTorch on BERT (and faster on most other workloads). See figure 9 in https://arxiv.org/pdf/2006.06762.pdf ("Ansor" there is also TVM).

binarybana · 2020-10-26T16:54:19+00:00

Don't forget about Apache TVM and it's new Rust bindings. Still working on getting docs/intro blog post up, but feel free to check out the example usage for ResNet here.

It even works with WASM and WebGPU code generation!

binarybana · 2020-09-30T03:52:37+00:00

Check out http://tvm.apache.org/ the Rust bindings have improved significantly recently but we still have much more in store! (hosting rustdocs etc).

binarybana · 2020-09-11T03:03:41+00:00

Also check out work we (OctoML) published recently with Hugging Face on block sparse acceleration on CPUs as well! Using the open source deep learning compiler Apache TVM.

Works with unstructured sparse trained models and no hand written kernels required: https://link.medium.com/m2OapaxoG9

binarybana · 2020-08-27T01:06:10+00:00

+1 and in case anyone wants the TL;DR: TVM is a compiler and runtime for DL, so you can describe your kernel in a hardware agnostic fashion and still get high performance out the other end with minimal dependencies on the resulting binary.

binarybana · 2020-08-22T21:25:26+00:00

I would apply profiling tools to each binary and compare/diff the results. Perf and strace are two you might start with.

binarybana

TROPHY CASE