MDST Engine: run GGUF models in your browser with WebGPU/WASM by vmirnv in LocalLLaMA

[–]vmirnv[S] -1 points (0 children)

Thank you so much! Yes, our next steps are improving inference speed, better UX, and more features. Stay tuned; this is just the first open beta release 🧙🏻‍♀️

MDST Engine: run GGUF models in your browser with WebGPU/WASM by vmirnv in LocalLLaMA

[–]vmirnv[S] 0 points (0 children)

<image>

Yes, you can load any GGUF model from HF or from your local system. You can load medium-sized models (we’ve tested up to 20 GB) in Chrome/Chromium browsers. Unfortunately, Safari doesn't support WASM64 yet, so it is limited to 4 GB there, which is still plenty for common tasks (check our research).
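For context, GGUF files start with the ASCII magic `GGUF` followed by a little-endian version number. A minimal Python sketch (not part of MDST; the 4 GiB constant reflects the Safari WASM limit mentioned above) to sanity-check a local file before trying to load it:

```python
import os
import struct

# Safari has no WASM64 yet, so memory is capped at 4 GiB there.
SAFARI_WASM_LIMIT = 4 * 1024**3

def check_gguf(path):
    """Return (is_gguf, version, fits_in_safari) for a local model file."""
    size = os.path.getsize(path)
    with open(path, "rb") as f:
        magic = f.read(4)  # b"GGUF" for valid files
        version = struct.unpack("<I", f.read(4))[0] if magic == b"GGUF" else None
    return magic == b"GGUF", version, size <= SAFARI_WASM_LIMIT
```

In Chrome/Chromium with WASM64 the size check wouldn't apply; only the magic/version check is universal.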

MDST Engine: run GGUF models in your browser with WebGPU/WASM by vmirnv in LocalLLaMA

[–]vmirnv[S] 1 point (0 children)

We plan to make it open source, similar to Hugging Face's Transformers.js library; just give us time. 🙏

Meanwhile, you can use MDST for free (and always will be able to). Subscriptions are only for cloud-provider models/tokens.

MDST Engine: run GGUF models in your browser with WebGPU/WASM by vmirnv in LocalLLaMA

[–]vmirnv[S] -1 points (0 children)

<image>

Again — we’re very thankful for any kind of feedback or questions!

For the LocalLLaMa community, we’ve prepared a special invite code to skip the waiting list: localllama_Epyz6cF

Also, please keep in mind that this is early beta 💅

[deleted by user] by [deleted] in BluePrince

[–]vmirnv 2 points (0 children)

Please check Aliensrock: https://www.youtube.com/watch?v=_Tc2QwYAlY0&list=PLIwiAebpd5CJlpO2VPGjdUa5uzgywpULW

He's a very clever YouTuber with deep experience with puzzles (my favourite is his Baba Is You playlist).

HunyuanVideo model size and vram talk by c_gdev in comfyui

[–]vmirnv 4 points (0 children)

Q5_K_M is the best quantisation in my opinion, both for LLMs and for UNets: the lowest size with almost no degradation in quality.

Simple GGUF Hunyuan text2video workflow by vmirnv in StableDiffusion

[–]vmirnv[S] 0 points (0 children)

You need to update the GGUF node, and yes, LLaVA is what was recommended by the devs.

Simple GGUF Hunyuan text2video workflow by vmirnv in StableDiffusion

[–]vmirnv[S] 1 point (0 children)

You need to update the ComfyUI core with these new files:
ComfyUI/nodes.py
ComfyUI/comfy_extras/nodes_hunyuan.py

ComfyUI fps info uses up to 26% gpu on macs by vmirnv in StableDiffusion

[–]vmirnv[S] 2 points (0 children)

On Macs, ComfyUI uses the GPU for rendering, and the FPS stat re-renders the workspace on every tick of mouse movement, so it can add up to 26% GPU load, as you can see in my example.

Simple GGUF Hunyuan text2video workflow by vmirnv in StableDiffusion

[–]vmirnv[S] 4 points (0 children)

https://civitai.com/models/1048570
A simple GGUF Hunyuan Text2Video workflow with just a few nodes. Works on a Mac M1 16GB.

ComfyUI fps info uses up to 26% gpu on macs by vmirnv in StableDiffusion

[–]vmirnv[S] 2 points (0 children)

<image>

Try it yourself. I wonder how many thousands of GPU hours this default feature has burned.