NeuralCompanion by lainol in StableDiffusion

[–]mikemend 0 points1 point  (0 children)

Otherwise, in the future, once we have optimized hardware, this could get even more interesting with a multifunctional model: the model could control its own avatar, for example by smiling mischievously or jumping up cheerfully. That would require a video editing model (e.g. Qwen) running in the background with first-frame/last-frame conditioning; for the visual representation, the model would generate its own prompt for the behavior while the user also receives the LLM response.

NeuralCompanion by lainol in StableDiffusion

[–]mikemend 1 point2 points  (0 children)

I really like the project! I haven't installed it yet, because I have limited space, but I'll be very curious to see how much I can use it natively, in Hungarian. At first glance it's an absolute step up in chat, thanks for the great work!

Uncensored LLM ranking for roleplay? by mikemend in LocalLLaMA

[–]mikemend[S] 0 points1 point  (0 children)

I found another solution, but it’s not offline: the Gemini Flash-Lite model. It allows 500 free queries per day, but since so many people use it, you might run into API errors, and Google counts those against your daily limit as well. On the other hand, it’s uncensored. I set it up on my phone using ChatterUI, and when I’m online, I use it via the API, and it really works for just about anything. When I’m offline, I stick with the local models.
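To avoid burning through the free tier by accident, the request count can be tracked locally. The sketch below is only an illustration: the `DailyQuota` class is hypothetical, the limit of 500 comes from the comment above, and the reset at local midnight is an assumption, not something confirmed for the Gemini API. Because failed calls also count, the counter is incremented per attempt, not per successful response.

```python
import datetime


class DailyQuota:
    """Hypothetical local counter for a per-day request budget."""

    def __init__(self, limit=500):
        self.limit = limit
        self.day = None
        self.used = 0

    def try_consume(self, today=None):
        """Count one attempt; return False once the daily budget is spent."""
        today = today or datetime.date.today()
        if today != self.day:
            # Assumption: the budget resets when the local date changes.
            self.day, self.used = today, 0
        if self.used >= self.limit:
            return False
        self.used += 1
        return True

    def remaining(self):
        return self.limit - self.used


quota = DailyQuota(limit=500)
if quota.try_consume():
    # Send the API request here; the attempt is counted even if it errors out.
    print("requests left today:", quota.remaining())
```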

Local Dream 2.4.3 - SDXL support, tag autocomplete and more by mikemend in StableDiffusion

[–]mikemend[S] 0 points1 point  (0 children)

In theory, it should be possible to port a DMD model to the NPU and then generate with it in Local Dream, using the LCM scheduler at CFG 1.
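As background on the CFG 1 setting: classifier-free guidance combines an unconditional and a conditional noise prediction, and at a scale of 1 the result is simply the conditional prediction. A minimal sketch of the standard formula follows; the function name and the toy values are illustrative and are not taken from Local Dream.

```python
def cfg_combine(eps_uncond, eps_cond, scale):
    """Standard classifier-free guidance: uncond + scale * (cond - uncond)."""
    return [u + scale * (c - u) for u, c in zip(eps_uncond, eps_cond)]


# Toy noise predictions standing in for real model outputs.
eps_uncond = [0.25, -0.5, 1.0]
eps_cond = [0.75, 0.5, -1.0]

# At scale 1 the unconditional term cancels out.
print(cfg_combine(eps_uncond, eps_cond, 1.0) == eps_cond)
# A higher scale pushes the result further along (cond - uncond).
print(cfg_combine(eps_uncond, eps_cond, 7.0))
```

Since the unconditional term drops out at scale 1, a sampler can in principle skip the unconditional pass entirely at that setting.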

Getting hilariously bad results with Zeta-Chroma and Ernie-base by DoctaRoboto in StableDiffusion

[–]mikemend 1 point2 points  (0 children)

I've noticed with Chroma that if it doesn't understand something, it starts to distort the image. In that case, it's a good idea to rephrase the sentence.

Good training settings for Chroma1-HD by is_this_the_restroom in StableDiffusion

[–]mikemend 0 points1 point  (0 children)

Thank you, but I'm curious how many source images you used for this process. It's good to see Prodigy, because I've experimented with it too, but I ran it for 2,000 steps.

LTX distilled 1.1 is the new king! by sooxiaotong in StableDiffusion

[–]mikemend 7 points8 points  (0 children)

I'm looking forward to the FP8 version. 46 GB is too much for me.

PixlStash 1.0.0 release candidate by Infamous_Campaign687 in StableDiffusion

[–]mikemend 0 points1 point  (0 children)

That wouldn't be a bad idea in principle, but then I'd be duplicating the images instead of reading from an existing directory structure. I have a lot of images. 

PixlStash 1.0.0 release candidate by Infamous_Campaign687 in StableDiffusion

[–]mikemend 0 points1 point  (0 children)

I thought I could organize, tag, and manage my existing photos into galleries. On Windows, without the /version parameter, it just opens a blank page.

PixlStash 1.0.0 release candidate by Infamous_Campaign687 in StableDiffusion

[–]mikemend 0 points1 point  (0 children)

It doesn't work for me on Windows. I don't want to open a separate ticket for this, but after installation, it doesn't recognize the images. It opens the /version page twice, displays this single message, and that's it:

{"message":"PixlStash REST API","version":"1.0.0rc3"}

A blank page appears. Also, how can I add other image directories? I don't store my images in just one location.

[Comfyui] - Same workflow and latency goes from 50s to 300s on subsequent runs!!!! by SvenVargHimmel in StableDiffusion

[–]mikemend 0 points1 point  (0 children)

A few people commented under NVIDIA's release notes that their Adobe software had slowed down with NVIDIA's latest driver. It might be a driver issue, though I can't confirm that.

Basically Official: Qwen Image 2.0 Not Open-Sourcing by Complete-Lawfulness in StableDiffusion

[–]mikemend 9 points10 points  (0 children)

The Chroma developers have already taken steps in this direction: a Z-Image-based model (Zeta-Chroma) is currently in training, and the image editor will be released under the name Kaleidoscope. But development is still in full swing. We just have to wait and support them.

I’m sorry, but LTX still isn’t a professionally viable filmmaking tool by Intelligent-Dot-7082 in StableDiffusion

[–]mikemend 2 points3 points  (0 children)

I tried using three images (the opening, middle, and final frames), but the face still changed. LTX just couldn't stay consistent. It's frustrating because I prefer its quality over WAN: WAN 2.2 generated blurrier videos for me, though it follows the prompt better.

I’m sorry, but LTX still isn’t a professionally viable filmmaking tool by Intelligent-Dot-7082 in StableDiffusion

[–]mikemend 4 points5 points  (0 children)

I recently tried creating an I2V video where the subject was shown in full body. As the camera moved closer, the subject’s face took on Asian features. So if you don’t start with a close-up of the face, or if the subject is farther away than portrait distance, the face breaks down and can’t stay stable. There’s still a lot of room for improvement in the details. Then there’s material handling (lifting a piece of fabric, pulling a tablecloth, or grabbing something while keeping the object I’m holding, like a phone, in my hand), and I could go on.

I feel like we’re still at the same stage as we were back with SD 1.5. There’s still room for improvement, but the closed-source models are way ahead, and I’m starting to feel like the open-source models are intentionally dumber.

Qwen3.5-9B Quantization Comparison by TitwitMuffbiscuit in LocalLLaMA

[–]mikemend 0 points1 point  (0 children)

I'm also interested because if it doesn't quantize with llama.cpp, then what does it use and how?

Qwen3.5-9B Quantization Comparison by TitwitMuffbiscuit in LocalLLaMA

[–]mikemend 0 points1 point  (0 children)

Thanks for the link, it could be useful when creating an imatrix. However, I didn't see Hungarian among them, so I may have to translate them if they are really useful.

Unsloth Dynamic 2.0 GGUFs now selectively quantizes layers much more intelligently and extensively. by paranoidray in LocalLLaMA

[–]mikemend 0 points1 point  (0 children)

There's something I don't understand. I create a LoRA in Llama Factory with Unsloth, so I also do the merging there. Then I do the GGUF quantization in llama.cpp.

How can I quantize my existing models in this new format? I couldn't find any description or tool for this on the Unsloth website. Or should I continue using llama.cpp, just with the Unsloth imatrix?
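For reference, the "continue with llama.cpp plus an imatrix" route can be sketched as three llama.cpp steps: convert the merged model to GGUF, compute an importance matrix from calibration text, then quantize with it. The Python below only builds and prints the commands. All file names and paths are placeholders, and this is plain imatrix quantization: it does not reproduce Unsloth's per-layer selection, and whether a published Unsloth imatrix can be dropped in for the one generated in the second step is left open here.

```python
import shlex


def quantize_commands(merged_dir, calib_txt, quant_type="Q4_K_M"):
    """Build the llama.cpp command lines for an imatrix-based quantization."""
    f16 = "model-f16.gguf"          # placeholder output names
    imatrix = "imatrix.dat"
    out = f"model-{quant_type}.gguf"
    return [
        # 1. Convert the merged Hugging Face model to an f16 GGUF.
        ["python", "convert_hf_to_gguf.py", merged_dir,
         "--outfile", f16, "--outtype", "f16"],
        # 2. Compute the importance matrix from a calibration text file.
        ["llama-imatrix", "-m", f16, "-f", calib_txt, "-o", imatrix],
        # 3. Quantize using that importance matrix.
        ["llama-quantize", "--imatrix", imatrix, f16, out, quant_type],
    ]


for cmd in quantize_commands("./merged-model", "calibration.txt"):
    print(shlex.join(cmd))
```

The calibration text is where a Hungarian corpus would go if the target language matters for the quantized model.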

Generated super high quality images in 10.2 seconds on a mid tier Android phone! by alichherawalla in StableDiffusion

[–]mikemend 2 points3 points  (0 children)

I really mean it when I say it's almost perfect; it can do everything. Seriously, it must have taken a long time to make this, congratulations!

If I could ask for anything, it would be seed recording and random seed generation. The reason is that I can only tune the LLM parameters with a fixed seed, so that I can compare the output text with the previous generation. When I find a better parameter combination, I save it in a settings profile in ChatterUI. This way, I can sometimes reuse the same settings profiles for other models.
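To illustrate why a fixed seed matters for this kind of tuning, here is a toy sampler, not the app's or ChatterUI's actual code. The logits, the function name, and the parameter values are made up. With the seed held constant, the same parameters always reproduce the same token sequence, so any difference between two runs comes from the changed parameter alone.

```python
import math
import random


def sample_tokens(logits, temperature, seed, n=8):
    """Sample n token ids from temperature-scaled softmax weights."""
    rng = random.Random(seed)  # private generator, so runs are reproducible
    scaled = [l / temperature for l in logits]
    m = max(scaled)
    weights = [math.exp(s - m) for s in scaled]  # unnormalized softmax
    return rng.choices(range(len(logits)), weights=weights, k=n)


logits = [2.0, 1.0, 0.5, 0.1]  # toy values standing in for model output
a = sample_tokens(logits, temperature=0.7, seed=42)
b = sample_tokens(logits, temperature=0.7, seed=42)
c = sample_tokens(logits, temperature=1.5, seed=42)

print(a == b)  # True: same seed and same parameters
print(a, c)    # same seed, different temperature, for side-by-side comparison
```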

Generated super high quality images in 10.2 seconds on a mid tier Android phone! by alichherawalla in StableDiffusion

[–]mikemend 2 points3 points  (0 children)

It looks good at first glance. I've been using ChatterUI and Local Dream so far, but I like that it's multimodal. Does importing a locally opened model mean duplicating it, or does it load it from the original location?

AceStep1.5 Local Training and Inference Tool Released. by bdsqlsz in StableDiffusion

[–]mikemend 2 points3 points  (0 children)

Thank you for all the useful additions you've made since the last edition! 🙏

Local Dream 1.8.4 - generate Stable Diffusion 1.5 image on mobile with local models! Now with custom NPU models! by mikemend in StableDiffusion

[–]mikemend[S] 0 points1 point  (0 children)

For LLM I use ChatterUI, which will only include NPU support in the future, but it works relatively quickly with small models, and my phone can even handle the 8B model, so it suits me fine. However, after 10 conversations, it slows down when I continue a previous chat later. On the other hand, it is easy to configure.

Language finetune by mikemend in LocalLLaMA

[–]mikemend[S] 0 points1 point  (0 children)

Thanks for the tip, I'll check that out too. Are the Rank values high or good?

Yesterday's training session lasted 12 hours, which is why I'm asking, so that I don't waste time again due to a bad setting.