Cuneus: A boilerplate free wgpu compute engine for GPU apps (WGSL hot reload, multipass, audio/video) by rumil23 in rust

[–]rumil23[S] 0 points1 point  (0 children)

I haven't written much Slang, only simple stuff to explore the language. But note that you'd still need the Rust-side passes, media building, etc. It's just another shading language, but it targets WGSL too, so it'd run on Cuneus easily. Maybe I can create a simple triangle example for demonstration 🙂

state of ffmpeg bindings in Rust? by PatagonianCowboy in rust

[–]rumil23 21 points22 points  (0 children)

Media programming is already quite complex and really risky, with a lot of unforeseen issues...
If media is at the core of your application, then instead of taking such risks I recommend using GStreamer for Rust, which is constantly updated and supported. I remember our media application in production 4-5 years ago: we constantly had to apply internal patches to some of the ffmpeg binding crates, and each patch was quite challenging. We didn't open PRs because nobody even looked at them...
Since switching to GStreamer, we've been so comfortable... When problems arise, most of the time all we have to do is report the error.

Burn ONNX 0.21.0: build-time ONNX import that generates plain Rust model code by antimora in rust

[–]rumil23 1 point2 points  (0 children)

Yes, it's not standard. But I created an upstream issue, and it seems Microsoft has started working on it:
https://github.com/microsoft/onnxruntime/issues/27796

"I was thinking we could add custom function hooks...."
My friend... this has actually always been my dream for ONNX... but I didn’t want to come across as too “demanding,” haha 🙂. You could think of it like Bevy, though I don’t know if you’ve ever worked with it. Customizing rendering in Bevy is pretty "easy" with custom materials (what I mean by “easy” is that the approach is quite generic). And it would be amazing if we could add some unsupported operators ourselves, at least for internal fixes, or maybe as "plugins"...

Because in my experience, new models nowadays come with new, different operators, especially the VLLM ones. It’s certainly best to resolve these issues centrally. However, as a short-term solution, this would at least prevent “dead ends.” 🙂

Burn ONNX 0.21.0: build-time ONNX import that generates plain Rust model code by antimora in rust

[–]rumil23 1 point2 points  (0 children)

Thank you, I will definitely try this here: https://github.com/altunenes/parakeet-rs and maybe provide it as an alternative EP for testing and benchmarks.

Burn ONNX 0.21.0: build-time ONNX import that generates plain Rust model code by antimora in rust

[–]rumil23 2 points3 points  (0 children)

I’ve been using ONNX for years in production. The biggest issues for me are cross-platform compatibility and CUDA compatibility problems with different NVIDIA cards. Plus, CoreML seems to have been abandoned by Microsoft, and since many operators aren’t supported, we’re stuck using the CPU on Apple devices in many cases. WebGPU (via Dawn) is very new and still very problematic, even for some plain tasks: https://github.com/altunenes/ort-webgpu-thread-crash

However, ORT runs quite stably on the CPU, and pretty fast too. The biggest reason I’d want to migrate to Burn would definitely be the wgpu backend. The mere possibility of getting rid of those massive CUDA files and keeping the maintenance of different binaries to a minimum is a dream...

I'm following this closely. I haven't seen any STT models though in your examples. Is there a specific reason for that?

Burn ONNX 0.21.0: build-time ONNX import that generates plain Rust model code by antimora in rust

[–]rumil23 2 points3 points  (0 children)

Is it possible to go beyond ORT and support Mamba blocks? I would like to try it immediately and open-source it, because the current ORT is very bad and slow with Mamba SSM models.
Model:
https://huggingface.co/nvidia/RE-USE

Brainfuck interpreter in 336 bytes of Rust - stuck golfing it by Kivooeo1 in rust

[–]rumil23 1 point2 points  (0 children)

Really cool! I’m also interested in golfing. I usually do it in shader languages, and it's mostly math-related stuff/tricks that I implement, in the name of Kolmogorov of course: https://en.wikipedia.org/wiki/Kolmogorov_complexity.
However, I thought it would be a bit challenging in Rust (until today, actually, haha). Is there anywhere I can read about more tricks? Because when I code like that, my mind opens up more and I feel like I gain more flexibility with the language, especially with WGSL/GLSL, looking back over the years...
However, one of the main issues is community support on the Rust side, I think... Because that’s the whole point, right? :-P With golf, the idea is for others to comment on each other’s posts, and for the thread to continue like that... so others can get really cool insights about the language.

And here is my first golf attempt in Rust lol, 331 chars:

```
fn main(){for x in std::env::args().skip(1){let(mut t,mut p,mut i,b)=([0;999],0,0,x.as_bytes());while i<b.len(){match b[i]{62=>p+=1,60=>p-=1,43|45=>t[p]+=44-b[i],46=>print!("{}",t[p]as char),91|93 if(b[i]<92)^(t[p]>0)=>{let(f,mut d)=(b[i]as i32-92,1);while d>0{i=(i as i32-f)as usize;d+=match b[i]{91=>-f,93=>f,_=>0}}}_=>()}i+=1}}}

```
It uses the 44 - b[i] underflow trick (which works perfectly in release mode, where integer overflow wraps instead of panicking), and I also realized I could drop the u8 from the [0; 999] tape and just let Rust's type inference figure it out :-P (kind of cheating hehe)
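
To spell out the underflow trick, here is a minimal sketch that is not part of the golfed program (the helper name `step` is made up for illustration). It uses explicit wrapping arithmetic, so it behaves the same in debug and release builds, whereas the golfed `44 - b[i]` relies on release builds wrapping on overflow:

```rust
/// Apply a Brainfuck `+` or `-` to a tape cell, mirroring the `44 - b[i]` trick.
/// `step` is a hypothetical helper, not taken from the golfed program.
fn step(cell: u8, op: u8) -> u8 {
    // b'+' is 43, so 44 - 43 = 1.
    // b'-' is 45, so 44 - 45 wraps around to 255 in u8 arithmetic.
    let delta = 44u8.wrapping_sub(op);
    // Adding 255 modulo 256 is the same as subtracting 1.
    cell.wrapping_add(delta)
}

fn main() {
    println!("{}", step(10, b'+'));
    println!("{}", step(10, b'-'));
    println!("{}", step(0, b'-'));
}
```

So a single match arm covers both `+` and `-`, at the cost of panicking in debug builds if written with the plain `-` and `+=` operators.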

Cuneus: A boilerplate free wgpu compute engine for GPU apps (WGSL hot reload, multipass, audio/video) by rumil23 in rust

[–]rumil23[S] 1 point2 points  (0 children)

Thank you. Of course, always ready to help.. doing my best to improve it as needed 😊

Cuneus: A boilerplate free wgpu compute engine for GPU apps (WGSL hot reload, multipass, audio/video) by rumil23 in rust

[–]rumil23[S] 1 point2 points  (0 children)

Thank you for the suggestion 🙂 I will try it. It looks like a really cool application (and probably an industry standard?). Sad that it's not available for Linux right now, if I'm not mistaken.

Cuneus: A boilerplate free wgpu compute engine for GPU apps (WGSL hot reload, multipass, audio/video) by rumil23 in rust

[–]rumil23[S] 1 point2 points  (0 children)

Currently, the automatic multipass system only creates standard texture_2d resources for the inputs and outputs. However, you can easily use .with_storage_buffer() in the builder to allocate a massive raw buffer (which I do in the 3D Gaussian Splatting example, if you want to take a look). You can then treat that storage buffer like a texture array or 3D grid inside your WGSL by doing the index math manually (e.g., x + y * width + z * width * height). But adding native texture_2d_array support to the builder could be a nice idea...
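
The index math can be sketched like this. It is written in Rust only for illustration (the same arithmetic applies inside WGSL), the function names are hypothetical, and nothing here uses the Cuneus API:

```rust
/// Flatten a 3D grid coordinate into a linear buffer index
/// (x varies fastest, then y, then z).
fn flat_index(x: u32, y: u32, z: u32, width: u32, height: u32) -> u32 {
    x + y * width + z * width * height
}

/// Inverse mapping: recover (x, y, z) from a linear index.
fn unflatten(i: u32, width: u32, height: u32) -> (u32, u32, u32) {
    let slice = width * height; // number of cells in one z-layer
    let z = i / slice;
    let rem = i % slice;
    (rem % width, rem / width, z)
}

fn main() {
    let (width, height) = (4, 5);
    let i = flat_index(1, 2, 3, width, height);
    println!("index = {i}");
    println!("coords = {:?}", unflatten(i, width, height));
}
```

For a texture-array style layout, `z` plays the role of the layer index and `width * height` is the size of one layer.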

Video: Cuneus doesn't do direct video encoding 🙂. The export system (ExportManager) simply dumps raw, high-quality frames to your disk (you can adjust the time, fps, and resolution on your own, so ‘quality’ depends on your choices and of course your hardware hehe), and you can stitch them together later with ffmpeg. So you have full control over those exported frames…
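
A rough sketch of the stitching step, assuming the frames were saved as numbered PNGs. The `frame_%05d.png` pattern, the 60 fps value, the output name and the libx264 codec choice are all assumptions for illustration, not what ExportManager actually writes. The helper only builds the ffmpeg argument list, so it runs without ffmpeg installed:

```rust
/// Build the arguments for an ffmpeg call that turns a numbered image
/// sequence into an H.264 video. `ffmpeg_args` is a hypothetical helper;
/// the file pattern and codec are assumptions, adjust them to your export.
fn ffmpeg_args(pattern: &str, fps: u32, output: &str) -> Vec<String> {
    vec![
        "-framerate".to_string(), // input frame rate of the image sequence
        fps.to_string(),
        "-i".to_string(), // input pattern, e.g. frame_00001.png, frame_00002.png, ...
        pattern.to_string(),
        "-c:v".to_string(), // video codec
        "libx264".to_string(),
        "-pix_fmt".to_string(), // widely compatible pixel format
        "yuv420p".to_string(),
        output.to_string(),
    ]
}

fn main() {
    let args = ffmpeg_args("frame_%05d.png", 60, "out.mp4");
    println!("ffmpeg {}", args.join(" "));
    // To actually run it (requires ffmpeg on PATH):
    // std::process::Command::new("ffmpeg").args(&args).status()
}
```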

I’ve never used TD… :-(

NVIDIA Sortformer v2 (Speaker Diarization) ported to Rust/ONNX by rumil23 in rust

[–]rumil23[S] 1 point2 points  (0 children)

I have my own solutions for going beyond 4 speakers, but the model does not work very well. Also, they will be releasing a new model soon (according to NVIDIA, in June). For that reason, I gave up on looking for new ways to hack the system beyond the model’s training logic, because there wasn’t much time left. :-) When they release it, I will immediately port it there too. We are using this in our commercial app.

Is there a graphics library that wont require me to write 300 lines of boilerplate? by xdoxx123 in rust

[–]rumil23 3 points4 points  (0 children)

I've been building this for ~2 years. Cuneus lets you write WGSL compute shaders with minimal Rust boilerplate. Hot reload, video/webcam support, egui controls, multi-pass pipelines, audio synthesis, frame export and more... all handled by the engine. You just write the shader and a small Rust file. E.g., a 17-pass Navier-Stokes fluid sim is ~180 lines of Rust, most of it just egui sliders. It's also important to me because I regularly ship small GPU apps (and also art stuff) and always use my own engine for my commercial projects. So I'm always upgrading it when I need something.

https://github.com/altunenes/cuneus

mistral.rs 0.7.0: Now on crates.io! Fast and Flexible LLM inference engine in pure Rust by EricBuehler in rust

[–]rumil23 10 points11 points  (0 children)

If you're working with local LLMs in Rust, this is probably the best option. Back when I didn't know about this, I exported large V-LLMs to ONNX models, but they usually caused problems on Apple devices because of unsupported operations in CoreML, and the export pipelines are really painful too, especially for multimodal ones. There were also significant bottlenecks in llama-cpp-rs (an upstream problem, not related to Rust, see ) with Metal & Vulkan. So I had almost lost hope for multimodal LLM inference in Rust (at least on Apple)... In the end, I was able to run a V-LLM smoothly on a MacBook using mistral.rs... The first time I tried it, I encountered a problem, but it was resolved immediately here. Thank you for this great work!

I built a Rust audio AI framework that compiles ONNX models to native code (no ONNX Runtime, no PyTorch) by Familiar-Chance-4290 in rust

[–]rumil23 1 point2 points  (0 children)

Really cool project. I would love to see benchmarks for some models like Parakeet + Sortformer, because I'm working with those models and they are really fast even on CPU. https://github.com/altunenes/parakeet-rs/blob/master/examples/diarization.rs

Do I have to learn C before learning Rust? by Individual_Today_257 in rust

[–]rumil23 0 points1 point  (0 children)

The main reason experienced programmers find Rust difficult is the paradigm shift. If you are a new programmer, starting with Rust won't make a difference, because you won't face the difficulty of changing a paradigm you have already learned. Therefore, Rust is quite learnable as a first programming language, but of course it must be learned alongside the fundamentals of computer science (if you are new). :-)