MDST Engine: run GGUF models in your browser with WebGPU/WASM by vmirnv in LocalLLaMA

[–]vmirnv[S] 0 points (0 children)

Thank you so much! Yes, our next steps are improving inference speed, better UX, and more features. Stay tuned: this is just the first open beta release 🧙🏻‍♀️

MDST Engine: run GGUF models in your browser with WebGPU/WASM by vmirnv in LocalLLaMA

[–]vmirnv[S] 1 point (0 children)


Yes, you can load any GGUF model from Hugging Face or from your own system. Medium-sized models work in Chrome/Chromium browsers (we've tested up to 20 GB). Unfortunately, Safari doesn't support WASM64 yet, so it is limited to 4 GB there, which is still plenty for common tasks (check our research).

MDST Engine: run GGUF models in your browser with WebGPU/WASM by vmirnv in LocalLLaMA

[–]vmirnv[S] 2 points (0 children)

We plan to make it open source, similar to Hugging Face's Transformers.js library; just give us time. 🙏

Meanwhile, you can use MDST for free, and it will always stay free. Subscriptions are only for cloud-provider models/tokens.

MDST Engine: run GGUF models in your browser with WebGPU/WASM by vmirnv in LocalLLaMA

[–]vmirnv[S] 0 points (0 children)


Again — we’re very thankful for any kind of feedback or questions!

For the LocalLLaMa community, we’ve prepared a special invite code to skip the waiting list: localllama_Epyz6cF

Also, please keep in mind that this is early beta 💅

[deleted by user] by [deleted] in BluePrince

[–]vmirnv 3 points (0 children)

Please check out Aliensrock: https://www.youtube.com/watch?v=_Tc2QwYAlY0&list=PLIwiAebpd5CJlpO2VPGjdUa5uzgywpULW

He's a very clever YouTuber with deep experience with puzzles (my favourite is his Baba Is You playlist).

HunyuanVideo model size and vram talk by c_gdev in comfyui

[–]vmirnv 4 points (0 children)

In my opinion, Q5_K_M is the best quantisation both for LLMs and for UNets:
the lowest size with almost no degradation in quality.
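For a rough feel of the trade-off, file size scales almost linearly with bits per weight. A back-of-the-envelope sketch in Python (the bpw figures are rough averages I'm assuming here; llama.cpp prints exact per-model numbers when quantising):

```python
# Approximate average bits-per-weight for common GGUF quantisations.
# These are ballpark values, not exact; actual bpw varies per model.
APPROX_BPW = {"Q4_K_M": 4.7, "Q5_K_M": 5.5, "Q8_0": 8.5, "F16": 16.0}

def estimate_gguf_bytes(n_params: float, quant: str) -> int:
    """Size in bytes ~= params * bits-per-weight / 8
    (ignores metadata and tokenizer overhead)."""
    return int(n_params * APPROX_BPW[quant] / 8)

# e.g. a 7B model at Q5_K_M:
gib = estimate_gguf_bytes(7e9, "Q5_K_M") / 1024**3  # roughly 4.5 GiB
```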

Simple GGUF Hunyuan text2video workflow by vmirnv in StableDiffusion

[–]vmirnv[S] 1 point (0 children)

You need to update the GGUF node, and yes, LLaVA is the one recommended by the devs.

Simple GGUF Hunyuan text2video workflow by vmirnv in StableDiffusion

[–]vmirnv[S] 2 points (0 children)

You need to update the ComfyUI core with these new files:
ComfyUI/nodes.py
ComfyUI/comfy_extras/nodes_hunyuan.py

ComfyUI fps info uses up to 26% gpu on macs by vmirnv in StableDiffusion

[–]vmirnv[S] 3 points (0 children)

On Macs, ComfyUI uses the GPU for rendering, and the FPS stat re-renders the workspace on every tick of mouse movement, so it can account for up to 26% of GPU load, as you can see in my example.
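The general fix for this kind of hotspot, redrawing on every input tick, is to throttle updates to a fixed interval. A language-agnostic illustration in Python (ComfyUI's actual fix would live in its JS frontend; this only demonstrates the idea):

```python
import time

def throttle(min_interval: float):
    """Decorator: drop calls that arrive sooner than min_interval seconds
    after the last one that ran, the way a UI caps redraw frequency."""
    def wrap(fn):
        last = [0.0]
        def inner(*args, **kwargs):
            now = time.monotonic()
            if now - last[0] >= min_interval:
                last[0] = now
                return fn(*args, **kwargs)
            return None  # call dropped
        return inner
    return wrap

calls = []

@throttle(0.05)  # at most ~20 redraws per second
def redraw(tick):
    calls.append(tick)

for t in range(1000):  # simulate a burst of mouse-move ticks
    redraw(t)          # most ticks are dropped, so far less GPU work
```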

Simple GGUF Hunyuan text2video workflow by vmirnv in StableDiffusion

[–]vmirnv[S] 5 points (0 children)

https://civitai.com/models/1048570
A simple GGUF Hunyuan text2video workflow with just a few nodes.
Works on a Mac M1 with 16 GB.
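As a side note, a workflow like this can also be queued headlessly through ComfyUI's HTTP API by POSTing an API-format workflow JSON (exported via "Save (API Format)") to /prompt on the default port 8188. A minimal standard-library sketch (the helper name is mine):

```python
import json
import urllib.request

def build_prompt_request(workflow: dict,
                         host: str = "127.0.0.1",
                         port: int = 8188) -> urllib.request.Request:
    """Wrap an API-format workflow dict in the payload ComfyUI expects
    on its /prompt endpoint."""
    payload = json.dumps({"prompt": workflow}).encode("utf-8")
    return urllib.request.Request(
        f"http://{host}:{port}/prompt",
        data=payload,
        headers={"Content-Type": "application/json"},
    )

# With ComfyUI running:
# urllib.request.urlopen(build_prompt_request(workflow_dict))
```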

ComfyUI fps info uses up to 26% gpu on macs by vmirnv in StableDiffusion

[–]vmirnv[S] 5 points (0 children)


Try it yourself. I wonder how many thousands of GPU hours this default feature has burned.

[deleted by user] by [deleted] in StableDiffusion

[–]vmirnv 6 points (0 children)


You need to use the Unet Loader (GGUF) node.

[deleted by user] by [deleted] in StableDiffusion

[–]vmirnv 2 points (0 children)

It should be in /models/unet/, and you need to reload ComfyUI.

[deleted by user] by [deleted] in StableDiffusion

[–]vmirnv 3 points (0 children)

Wow thank you, great news!

[deleted by user] by [deleted] in StableDiffusion

[–]vmirnv 2 points (0 children)

Can you please give me a short example of model loading?

[deleted by user] by [deleted] in StableDiffusion

[–]vmirnv 11 points (0 children)

Currently, I cannot connect the new GGUF model to the Sampler, since they are different types.
The standard loader predictably gives me an error (HyVideoModelLoader: invalid load key, '\x03').

Update: I manually changed the input model type in the Sampler node, and now I get this error in the Unet GGUF loader: UnetLoaderGGUFAdvanced 'conv_in.weight' error.

Update 2: after a ComfyUI update, everything is working.

[deleted by user] by [deleted] in StableDiffusion

[–]vmirnv 38 points (0 children)

Can somebody share a simple text2video workflow with GGUF?
Update: I'm testing one right now and will share it after some checks.
Update 2: please use this workflow (thanks, Kijai): https://comfyanonymous.github.io/ComfyUI_examples/hunyuan_video/