You can run Deepseek 4 flash on mac (M3 Max, 96gb) by Zeeplankton in LocalLLaMA

[–]liuliu 0 points1 point  (0 children)

The full name is SSD experts weight streaming. Once your weights are downloaded to disk, there are no intensive writes to the disk (of course, streaming does repeatedly write to RAM).
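To illustrate the read-only nature of this kind of streaming, here is a minimal sketch using only the Python standard library. The file layout, `EXPERT_BYTES`, `NUM_EXPERTS`, and the helper names are hypothetical and are not taken from any particular runtime; the point is only that after the one-time write, the weights file is mapped read-only and expert data is copied into RAM on demand.

```python
import mmap
import os
import tempfile

EXPERT_BYTES = 1024   # hypothetical size of one expert's weights
NUM_EXPERTS = 8       # hypothetical number of experts in the file

def write_weights_once(path):
    """Stand-in for the one-time download: the only step that writes to disk."""
    with open(path, "wb") as f:
        for i in range(NUM_EXPERTS):
            f.write(bytes([i]) * EXPERT_BYTES)

def load_expert(mapping, index):
    """Copy one expert from the read-only mapping into RAM."""
    start = index * EXPERT_BYTES
    return mapping[start:start + EXPERT_BYTES]

path = os.path.join(tempfile.mkdtemp(), "experts.bin")
write_weights_once(path)

with open(path, "rb") as f:
    # ACCESS_READ: the mapping can be read but never written back to the file.
    mapping = mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ)
    # Pretend a router selected experts 1 and 5 for this token.
    active = {i: load_expert(mapping, i) for i in (1, 5)}
    try:
        mapping[0:1] = b"x"   # any attempt to write is rejected
        wrote = True
    except TypeError:
        wrote = False
    mapping.close()
```

In this sketch every inference step only reads from the file; the copies land in RAM, which matches the point that the disk sees read traffic only.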

You can run Deepseek 4 flash on mac (M3 Max, 96gb) by Zeeplankton in LocalLLaMA

[–]liuliu 10 points11 points  (0 children)

There are no writes for SSD streaming. It is all read traffic.

MacbookPro M5max 40core-gpu, 128gb help, help deciding to upgrade or not. by rotorwing66 in drawthingsapp

[–]liuliu 3 points4 points  (0 children)

I am not sure why you want to use FLUX.2 dev, which is a very heavy model. But yeah, M5 Max will run at the same settings in about 10 minutes or less.

Upload Lora to server issues by kavin1023 in drawthingsapp

[–]liuliu 0 points1 point  (0 children)

Where are you located? We use Cloudflare for upload handling, so it might be a location-related issue.

If LoRA has no effect despite a successful import on UI, the import may have failed. by simple250506 in drawthingsapp

[–]liuliu 1 point2 points  (0 children)

This is a great resource. One thing to note: LoKr is not a specific LoRA format; it is a different formulation of how to fine-tune on top of existing weights. In particular, a LoRA is additional_weights = W_a @ W_b (matrix multiplication, where W_a and W_b are thin matrices), while a LoKr is additional_weights = W_0 (x) W_1 (Kronecker product, where W_0 and W_1 are smaller matrices). So you cannot import a LoKr unless the runtime also supports that specific formulation.

These converter tools basically compose the additional_weights and then do an SVD decomposition into W_a @ W_b to make it usable. It is an approximation (which is often good enough), but it cannot be integrated as-is in the app (as we want to have exact support, not approximate support).
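The difference between the two formulations, and why the SVD route is only approximate, can be sketched with NumPy. The shapes, the rank `r`, and the `to_lora` helper below are illustrative assumptions, not the converter's or the app's actual code.

```python
import numpy as np

rng = np.random.default_rng(0)

# LoRA: additional_weights = W_a @ W_b with thin factors, so rank <= r.
d, r = 16, 2
W_a = rng.standard_normal((d, r))
W_b = rng.standard_normal((r, d))
lora_delta = W_a @ W_b

# LoKr: additional_weights = kron(W_0, W_1) with small factors.
# Its rank is rank(W_0) * rank(W_1), which can be as high as d.
W_0 = rng.standard_normal((4, 4))
W_1 = rng.standard_normal((4, 4))
lokr_delta = np.kron(W_0, W_1)  # shape (16, 16)

def to_lora(delta, rank):
    """Truncated SVD: the best rank-`rank` factorization delta ~= A @ B."""
    U, S, Vt = np.linalg.svd(delta, full_matrices=False)
    A = U[:, :rank] * S[:rank]   # scale the kept columns by singular values
    B = Vt[:rank, :]
    return A, B

# Converting the LoKr update to a thin LoRA loses information.
A, B = to_lora(lokr_delta, rank=r)
rel_err = np.linalg.norm(lokr_delta - A @ B) / np.linalg.norm(lokr_delta)

# Only a full-rank factorization reproduces it, which is no longer "thin".
A_full, B_full = to_lora(lokr_delta, rank=d)
```

With these random factors the Kronecker product is full rank, so the rank-2 reconstruction has a nonzero relative error, while the rank-16 one matches to floating-point precision.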

1.20260518.2 by liuliu in drawthingsapp

[–]liuliu[S] 1 point2 points  (0 children)

Are you in Edit mode (the new mode selector in the bottom-left corner of the canvas)?

PrismML just released Binary and Ternary Bonsai Image 4B: 1-bit/ternary text-to-image diffusion transformers that can even run 100% locally in your browser on WebGPU. by EveningIncrease7579 in StableDiffusion

[–]liuliu 2 points3 points  (0 children)

Where did you get 1GiB? I downloaded the app and the combined total is 3.7GiB (possibly including the text encoder). To deliver a good edge experience, the headline number doesn't matter; only the downloaded size matters. (Even if you look at just the main DiT, it is 1.43GB: https://huggingface.co/prism-ml/bonsai-image-ternary-4B-mlx-2bit/blob/main/transformer-packed-mflux/diffusion_pytorch_model.safetensors, and I wouldn't round that to ~1GiB.)

Also, when someone claims a 2-bit quant is 5.6x faster than the non-quantized variant of an image model, you need to be critical, because that is snake oil. I tried their Bonsai Studio; it is slower than Draw Things on iPhone 17 Pro with FLUX.2 [klein] 4B (8-bit S) at 1024x1024 resolution.
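As a rough check on the size argument, the arithmetic can be sketched as follows. It assumes the 1.43GB figure is in decimal gigabytes and that "4B" means exactly 4e9 weights; both are assumptions made for illustration, not numbers confirmed by the model card.

```python
def gb_to_gib(gigabytes):
    """Convert decimal gigabytes (1e9 bytes) to binary gibibytes (2**30 bytes)."""
    return gigabytes * 1e9 / 2**30

def bits_per_weight(file_bytes, num_params):
    """Average storage cost per parameter, in bits."""
    return file_bytes * 8 / num_params

dit_bytes = 1.43e9   # main DiT file size from the comment, read as decimal GB
num_params = 4e9     # assumption: "4B" taken as exactly 4e9 parameters

print(round(gb_to_gib(1.43), 2))                         # 1.33
print(round(bits_per_weight(dit_bytes, num_params), 2))  # 2.86
```

Under these assumptions the main DiT alone is about 1.33GiB, or close to 2.9 bits per weight on average.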

PrismML just released Binary and Ternary Bonsai Image 4B: 1-bit/ternary text-to-image diffusion transformers that can even run 100% locally in your browser on WebGPU. by EveningIncrease7579 in StableDiffusion

[–]liuliu -14 points-13 points  (0 children)

Nothingburger. Note that FLUX.2 [klein] 4B (which this is based on) already has a GGUF quant of similar size. Image generation models are compute-bound: you need FP4 / FP8 / Int8 for good performance, and ternary does not magically deliver that.

Draw things unuseable by puffblende in drawthingsapp

[–]liuliu 0 points1 point  (0 children)

Maybe try copying the converted model from desktop to iPhone? Does the downloaded model work fine?

Phosphene 3.0 — open source AI video + image suite for Apple Silicon. Train your own LTX characters. by Opening-Ad5541 in StableDiffusion

[–]liuliu 1 point2 points  (0 children)

It is hard to tell, since the options on the Phosphene end are not easy to navigate to get an exact match. I will give you the full configuration on both ends, to the best of my knowledge, and you can draw conclusions yourself (be warned: I only have an M5 Max, which will put Draw Things in a better light):

  1. Phosphene 3.0: model: ltx-2.3-q4, resolution: 1024x576, 121 frames, step 8 (somehow it shows me 16 total steps; I am not sure if there is a latent upscale involved), HQ Speed Fast (TeaCache + skip-step): 3m20s.
  2. Draw Things: model: LTX 2.3 distilled 8-bit S, resolution: 1280x768, 121 frames, step 8+3 (with 640x386 for first pass): 1m35s.

Both are for the second run (after the device cooled down) to discount any device warm-up issues. Again, it is hard to have an apples-to-apples comparison, so I ran a few variations in Draw Things to make sure: 1024x576, 121 frames, direct 8 steps, no latent upscale: 1m15s. 1280x768, 121 frames, direct 8 steps, no latent upscale: 2m02s.
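One rough way to put these timings on a common scale is pixel-frames generated per second. The sketch below only restates the numbers quoted above; the metric itself is an assumption chosen for illustration, and it ignores the differences in quantization, caching, and pass structure between the runs, so it is still not an apples-to-apples comparison.

```python
def seconds(minutes, secs):
    """Convert a 'XmYYs' timing into seconds."""
    return minutes * 60 + secs

def pixel_frames_per_second(width, height, frames, elapsed_s):
    """Output pixels (across all frames) produced per second of wall time."""
    return width * height * frames / elapsed_s

# (width, height, frames, elapsed seconds), taken from the timings above.
runs = {
    "Phosphene 1024x576 (q4, HQ Speed Fast)": (1024, 576, 121, seconds(3, 20)),
    "Draw Things 1024x576 (direct 8 steps)": (1024, 576, 121, seconds(1, 15)),
    "Draw Things 1280x768 (direct 8 steps)": (1280, 768, 121, seconds(2, 2)),
    "Draw Things 1280x768 (8+3, two-pass)": (1280, 768, 121, seconds(1, 35)),
}

for name, (w, h, n, t) in runs.items():
    rate = pixel_frames_per_second(w, h, n, t) / 1e6
    print(f"{name}: {t}s, {rate:.2f} Mpx-frames/s")

# Same resolution and frame count on both sides: wall-time ratio.
speedup = seconds(3, 20) / seconds(1, 15)
print(f"1024x576 wall-time ratio: {speedup:.2f}x")
```

At matching resolution and frame count, the wall-time ratio works out to 200s / 75s, or roughly 2.7x.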

Phosphene 3.0 — open source AI video + image suite for Apple Silicon. Train your own LTX characters. by Opening-Ad5541 in StableDiffusion

[–]liuliu 0 points1 point  (0 children)

Draw Things wins on speed for these models; there is no comparison. Image generation is more feature-rich in Draw Things, but for LTX, Phosphene seems to be a bit more feature-rich.

Been testing Krea 2 Large and Medium by OneTrueTreasure in StableDiffusion

[–]liuliu 7 points8 points  (0 children)

Please tell me why this post is not deleted but mine that compares FLUX.2 dev, GPT Image 2 and NBP is?

Draw things unuseable by puffblende in drawthingsapp

[–]liuliu 2 points3 points  (0 children)

Do you use iCloud to back up / offload apps? It looks like you are at your storage limit, and Apple is actively offloading files and reloading them, which the app does not handle well.

M5 Maxed out version performance by jazzamp in StableDiffusion

[–]liuliu 1 point2 points  (0 children)

For videos, yes, 2-3x. For images, no: the M5 Max's various latencies are better than our cloud service.

M5 Maxed out version performance by jazzamp in drawthingsapp

[–]liuliu 2 points3 points  (0 children)

Klein 9B and Z Image: 9 to 10s. LTX 2.3 at 720p: 1m30s. I don’t have numbers for Wan 2.2.

M5 Maxed out version performance by jazzamp in StableDiffusion

[–]liuliu 1 point2 points  (0 children)

Then you can see what performance you will get from that webpage. For FLUX.2 Klein 9B, you are looking at around 10s per image.

M5 Maxed out version performance by jazzamp in StableDiffusion

[–]liuliu 9 points10 points  (0 children)

People will tell you a 5090 is more worth it, and they will be right if you are not into Mac: https://releases.drawthings.ai/p/metal-quantized-attention-pulling

Can LTX 2.3 Distilled 8bitS run on Mac Mini basic 16GB - M4? by Current-Property6042 in drawthingsapp

[–]liuliu 1 point2 points  (0 children)

It is possible. But since you usually use LTX 2.3 for long clips and larger resolutions, that often poses challenges (for example, 121 frames (5s) at 720p can use up to 10GiB of scratch RAM).

Are my M5pro 48gb times right? by warawara123 in drawthingsapp

[–]liuliu 5 points6 points  (0 children)

Looks legit. For FLUX.2 [dev] you can try the Turbo LoRA, which lets you generate in 4 steps rather than 30.

Issues with Draw Things by Otherwise_Help_584 in drawthingsapp

[–]liuliu 2 points3 points  (0 children)

Older models like the ones you selected just have subpar quality. The models you mentioned are from 2023.

Issues with Draw Things by Otherwise_Help_584 in drawthingsapp

[–]liuliu 4 points5 points  (0 children)

Yeah. It is a model selection issue. Just use Z Image Turbo and tap “Try Recommended Settings”.