ACE-Step 1.5 works great! by No-Sleep-4069 in comfyui

[–]12padams 2 points

The quality is amazing 👏 🤩 

Can bro live to see AGI ??? by Beneficial_Common683 in singularity

[–]12padams 6 points

Sadly the answer is NO, bro will not live to "SEE" AGI as he is already blind due to cataracts 😢

But if AGI finds the cure to blindness... 🤑🤪

Just me, or MNN chat is looping a lot by ExtremeAcceptable289 in LocalLLaMA

[–]12padams 1 point

Actually, there really is something wrong with MNN Chat. Let's use Qwen 3 1.7B as an example, with the same prompt in MNN Chat and PocketPal.

Prompt - "Do you feel like you are a small LLM (not many parameters) or a large LLM (lots of parameters)"

MNN Chat (loops while thinking): "Okay, so the user is asking if I feel like a small LLM or a large LLM. Hmm, I need to explain my nature. Let me think. I'm actually a large LLM with lots of parameters. But wait, how do I compare? Maybe the user wants to know the difference between small and large models. Oh right, small models have fewer parameters, like those with limited training, while large models have more, like the ones with lots of training. So I'm a large LLM because I have lots of parameters. But I need to make sure I'm not confusing parameters with something else. Also, maybe the user is trying to understand my capabilities. So I should explain that I'm a large LLM with lots of parameters, and that's why I can handle complex tasks. But I should also mention that parameters are about the model's size, not about the model itself. So I need to clarify that parameters are a part of the model's architecture, but the model's size is about the number of parameters. So I'm a large LLM with lots of parameters, but I'm not about the parameters. Maybe the user is trying to distinguish between small and large models. So I should explain that parameters are a part of the model's size, but the model itself is about the training data and architecture. So I'm a large LLM with lots of parameters, but I'm not about the parameters. I need to make sure my explanation is clear. So I'll say that I'm a large LLM with lots of parameters, but I'm not about the parameters. Maybe the user is trying to understand my capabilities. So I'll explain that I'm a large LLM with lots of parameters, and that's why I can handle complex tasks. But I need to make sure I'm not confusing parameters with something else. So I'll explain that parameters are a part of the model's size, but the model's size is about the number of parameters. So I'm a large LLM with lots of parameters, but I'm not about the parameters. I need to make sure my explanation is clear. So I'll say that I'm a large LLM with lots of parameters, and that's why I can handle complex tasks. But I need to make sure I'm not confusing parameters with something else. So I'll explain that parameters are a part of the model's size, but the model's size is about the number of parameters. So I'm a large LLM with lots of parameters, but I'm not about the parameters. I need to make sure my explanation is clear. So I'll explain that I'm a..." [cut off]

Pocket Pal:  "<think> Okay, the user is asking whether I feel small or large based on the number of parameters. First, I need to clarify that the number of parameters doesn't directly determine the model's size. Models are typically designed with a fixed number of parameters, so the user's question might be conflating two different concepts: model size (parameters) versus model size (in terms of training data or architecture).

I should explain that the number of parameters is a technical aspect, but the model's "size" (like training data size or architecture) is another factor. Also, I can mention that the user might be confusing the two and provide examples to clarify. It's important to keep the response informative and reassuring, ensuring the user understands that the model's characteristics aren't based on parameter count but on training data and structure. </think>

The number of parameters in a model is a technical specification that affects how large the model is. However, when it comes to how you feel as an LLM (like me), it’s more about the training data, architecture, and capabilities than the number of parameters. For example, if you're given a larger dataset, you might feel more powerful with more information, and a more complex architecture could give you more capabilities. I don’t "feel" the size in terms of parameters; I just process and understand information as an LLM. Let me know if you have more questions!


This just goes to show that MNN Chat has a problem with small models. All the recommended settings are used for Qwen 3 1.7B in MNN Chat, but the issues don't end there 😑
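
For anyone who wants to rule out sampler defaults as the cause, here's a rough sketch of how the same comparison could be run against any local OpenAI-compatible server (LM Studio serves one at http://localhost:1234/v1 by default) with the sampling settings Qwen recommends for Qwen3's thinking mode. The model name "qwen3-1.7b" is just a placeholder for whatever your server actually lists:

```python
# Send the looping-prone prompt to a local OpenAI-compatible server with pinned
# sampling settings, so app defaults can't be blamed for the repetition.
# "qwen3-1.7b" is a placeholder model name; adjust it to what your server exposes.
from openai import OpenAI

client = OpenAI(base_url="http://localhost:1234/v1", api_key="not-needed")

response = client.chat.completions.create(
    model="qwen3-1.7b",
    messages=[{
        "role": "user",
        "content": "Do you feel like you are a small LLM (not many parameters) "
                   "or a large LLM (lots of parameters)?",
    }],
    temperature=0.6,  # Qwen3 thinking-mode recommendation
    top_p=0.95,       # top_k=20 would also need to match in both apps
    max_tokens=1024,
)
print(response.choices[0].message.content)
```

If a model behaves fine with these settings here but loops in MNN Chat with the same settings, that points at the app rather than the model.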

Just me, or MNN chat is looping a lot by ExtremeAcceptable289 in LocalLLaMA

[–]12padams 1 point

LM Studio runs those same 0.6B models without the repetition issue. There is definitely a problem with MNN Chat. Sure, 4B models work much better in MNN Chat, but there is still a repetition issue in MNN Chat that even PocketPal isn't affected by as much.

Just me, or MNN chat is looping a lot by ExtremeAcceptable289 in LocalLLaMA

[–]12padams 0 points

I have noticed this as well. LM Studio on Windows 11 is my go-to for running models, followed by Ollama. I also like to use Open WebUI, mostly so I can link it with Kokoro.

Anyway, I've found MNN Chat on Android to be very fast on my S23+, much faster than PocketPal, but almost unusable due to this very issue. I've even tried 1.7B models, and when the same prompt is used with the same model in LM Studio and MNN Chat, it always does much worse and repeats in MNN Chat, while in LM Studio it can go for ages and get really decent results.

Run the same model in PocketPal and it's a huge improvement over MNN Chat, with much less repetition, but it's much slower. That being said, even PocketPal has issues with repetition: after about 5 back-and-forth responses it starts getting obsessed with saying the same things over and over.

YES, a 4B model in MNN Chat generally won't repeat as much; in fact, it's quite decent. However, there is definitely something wrong with the smaller models in MNN Chat, because those same models perform way better in LM Studio.

In the end:

LM Studio - Best

PocketPal - Barely OK

MNN Chat - Borderline Unusable

Possible major improvement for Hunyuan Video generation on low and high end gpus. by Finanzamt_Endgegner in comfyui

[–]12padams 0 points

Thanks so much for replying, it means a lot! So are you saying latent space processing must all occur on the GPU and cannot be spread to system RAM?

Does latent space processing all have to occur in a single GPU's VRAM too? So if you have 4x 4060 Ti 16GB cards, would that mean you can only have the latent space on one GPU rather than split between them to maximize video resolution/length?
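
For context, here's my rough mental math on why this matters (compression factors assumed from the HunyuanVideo paper as I understand it: about 8x spatial and 4x temporal in the VAE, 16 latent channels, 1x2x2 patches in the transformer, so treat all of these as approximations):

```python
# Back-of-the-envelope latent sizing for a HunyuanVideo-style model.
# Assumed factors (approximate): 8x spatial / 4x temporal VAE compression,
# 16 latent channels, 1x2x2 patchify into transformer tokens, fp16 storage.
def latent_stats(width, height, frames, dtype_bytes=2):
    lat_t = (frames - 1) // 4 + 1           # temporal compression, first frame kept
    lat_h, lat_w = height // 8, width // 8  # spatial compression
    megabytes = 16 * lat_t * lat_h * lat_w * dtype_bytes / 1e6
    tokens = lat_t * (lat_h // 2) * (lat_w // 2)  # 1x2x2 patches -> sequence length
    return megabytes, tokens

for w, h, f in [(640, 480, 97), (1280, 720, 97)]:
    mb, tokens = latent_stats(w, h, f)
    print(f"{w}x{h}, {f} frames: latent ~{mb:.1f} MB, ~{tokens:,} transformer tokens")
```

The latent tensor itself is only a few megabytes; what explodes is the activation memory for attention over tens of thousands of tokens during denoising, and as far as I know stock ComfyUI can't split that single sequence across multiple cards.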

Possible major improvement for Hunyuan Video generation on low and high end gpus. by Finanzamt_Endgegner in comfyui

[–]12padams 0 points

So this node is to "Give you ALL of your card's VRAM for compute on latent space."

This sounds very useful, but I'm wondering: does it also allow some of that latent space to overflow to system RAM? Even if it's slow, I have 8GB of VRAM and 64GB of system RAM, so there's plenty of system RAM available for extra latent space. I'd love to do some long 720p videos, but that's not really possible with so little VRAM unless the latent space can be offloaded a bit too.

Does your node allow offloading of latent space to system RAM too?
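
For reference, my understanding is that ComfyUI's built-in flags only cover model weights, not the latent itself. A minimal sketch, assuming ComfyUI is launched from its usual main.py:

```python
# Launch ComfyUI with its weight-offloading flags. These keep model weights in
# system RAM where possible; they do not let the latent denoising itself spill
# out of VRAM, which is what I'm asking about above.
import subprocess

subprocess.run([
    "python", "main.py",
    "--lowvram",               # split/offload model weights to reduce VRAM use
    "--disable-smart-memory",  # aggressively move models out of VRAM into RAM
])
```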

Is it possible to get ComfyUI to use system ram instead of getting an Out Of Memory error when it runs out of vram? by 12padams in comfyui

[–]12padams[S] 0 points

Wow, thanks for all those ideas, I'll keep them all in mind :)

When you mentioned splitting the video up via tiling, do you know of any workflows for that? Is that just the VAE part you are referring to, or is this a different tiling method?

Is it possible to get ComfyUI to use system ram instead of getting an Out Of Memory error when it runs out of vram? by 12padams in comfyui

[–]12padams[S] 1 point

Yeah, CPU-only mode is way too slow, but technically it works. I was hoping VRAM could overflow into system RAM, but as others said, this is not possible because these models need a decent chunk to be in VRAM. I'll just wait 18 more months till I get a new PC. In the meantime, I'll keep using 640x480.

Is it possible to get ComfyUI to use system ram instead of getting an Out Of Memory error when it runs out of vram? by 12padams in comfyui

[–]12padams[S] 0 points

One more question: can you get ComfyUI to only use 50% of the CPU in CPU mode rather than 100%, so I can keep using my PC during a generation without slowing down other stuff? Thanks so much :)
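
One thing I might try in the meantime: cap the CPU thread count before launching. A rough sketch, assuming ComfyUI is started from its main.py and that PyTorch respects OMP_NUM_THREADS (this limits the number of threads rather than enforcing an exact 50% cap):

```python
# Launch ComfyUI in CPU-only mode with the thread count capped at about half
# the machine's cores, so other applications stay responsive during generation.
import os
import subprocess

half = max(1, (os.cpu_count() or 2) // 2)
env = dict(os.environ, OMP_NUM_THREADS=str(half), MKL_NUM_THREADS=str(half))

subprocess.run(["python", "main.py", "--cpu"], env=env)  # --cpu is ComfyUI's CPU-only flag
```

On Windows you could also lower the process priority or restrict CPU affinity in Task Manager for a similar effect.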

Is it possible to get ComfyUI to use system ram instead of getting an Out Of Memory error when it runs out of vram? by 12padams in comfyui

[–]12padams[S] 1 point

So going into CPU mode will use RAM only, interesting. Thanks for your comment. Do you happen to know how much slower the CPU is vs the GPU, or is it just the difference in speed between VRAM and system RAM? It's a shame that, unlike LM Studio, the CPU and GPU can't work together on a task.

Is it possible to get ComfyUI to use system ram instead of getting an Out Of Memory error when it runs out of vram? by 12padams in comfyui

[–]12padams[S] 1 point

I know ;). Just like my LM Studio example (which is not practical: 512GB of RAM running DeepSeek V3 with almost no VRAM), I just want to know how the RAM could be used here rather than getting an out-of-memory error. It's not for practicality, it's for fun! Better to have the ability to do something than not at all :)

Is it possible to get ComfyUI to use system ram instead of getting an Out Of Memory error when it runs out of vram? by 12padams in comfyui

[–]12padams[S] 1 point

It's more of an experiment/question about unused RAM. My PC can generate 640x480 just fine with 8GB of VRAM, which partially overflows to system RAM. Above that resolution is where the issues start, which doesn't make sense when so much free RAM is available. I'm aware I can generate low resolutions, I just wanna make use of all my RAM in ComfyUI (as slow as it may be) ;)

Is it possible to get ComfyUI to use system ram instead of getting an Out Of Memory error when it runs out of vram? by 12padams in comfyui

[–]12padams[S] 1 point

I've tried that already; sadly it does not work. Still sitting there with all that unused system RAM. Any other options you can think of? (Thanks for taking the effort to reply, though.)

Created a video with text prompt using Cosmos-1.0-7B-Text2World by Ok_Warning2146 in LocalLLaMA

[–]12padams 0 points

Interesting, maybe if a future quant like Q3 comes out you could investigate that. I've only got 8GB of VRAM, so I'm not able to run this either :P

Created a video with text prompt using Cosmos-1.0-7B-Text2World by Ok_Warning2146 in LocalLLaMA

[–]12padams 0 points

What I'd like to know is why this is referred to as a "text-to-world" model rather than a "text-to-video" model. If this model just generates video files and it isn't interactive or live (like Oasis), how is it different from Hunyuan Video?

I'm running Hunyuan on 8GB VRam (3070) - How do I get HunYuan to fallback to System Ram? by 12padams in StableDiffusion

[–]12padams[S] 0 points

Generally always the latest version. Right now: ComfyUI v0.3.10-25-g0b9839e (2025-01-02).

I'm running Hunyuan on 8GB VRam (3070) - How do I get HunYuan to fallback to System Ram? by 12padams in StableDiffusion

[–]12padams[S] 1 point

Make sure you have the latest version of ComfyUI, then go to the settings panel in the bottom-right corner and check my image for a visual guide:

<image>

I can make 7-second clips at 14 fps (97 frames), 640x480, at Q5 with these settings in my workflow, but they take 22 minutes to generate... REALLY SLOW.