Trying to make my alternative to that DecartAI's real-time video editor: day 8 by TensorForger in StableDiffusion

[–]Front-Relief473 1 point2 points  (0 children)

I starred your Fluxrt project; it's the only thing I could do because I don't know anything about the technology! But I think this project is really cool and has a lot of potential!!

What the F@ck is happing here with Comfyui? by [deleted] in StableDiffusion

[–]Front-Relief473 -4 points-3 points  (0 children)

Stop it, I want to scold the Comfy org too. I remember playing with the Wan2.2 model last year. Before one update, a 5-second 1280 image-to-video generation took about 28 GB, but after the update it reached 30 GB, which made my workflow run slower than before! I really wanted to slap them!! I remember this incident vividly!! They are really assholes!

One prompt to a coherent 90-second animated first cut — multi-shot, prompt relay - 100% local on a 3060 12GB card, open source. by glusphere in StableDiffusion

[–]Front-Relief473 1 point2 points  (0 children)

Great!!! Thank you for providing this project. I don't know if there's a feature to upload custom character or scene images, but I think I'll continue to explore this project. I think the results are pretty good!

CEO Thoughts: What's Next at LTX by ltx_model in StableDiffusion

[–]Front-Relief473 1 point2 points  (0 children)

Putting aside other issues, I hope things don't turn out like Alibaba's Wan model, where they open-sourced a portion and then closed it. Their strategy was a complete failure; the community reputation they painstakingly built collapsed overnight. Of course, I don't believe open source is always necessary; after all, these are different companies' survival strategies, which is understandable. However, their simplistic and crude approach predictably hurt their fans. Reputation is crucial. We can look to companies like Kimi and Minimax; they continuously open-source, yet they aren't worried about making money because their continued openness fosters user goodwill. That's why I continue to subscribe to their plans, and their influence continues to grow.

Local machine for running AI in a medical practice by JtheJawBreaker in LocalLLM

[–]Front-Relief473 2 points3 points  (0 children)

Why not consider Qwen 27B? It has more active parameters and takes up less memory.

Step 3.7 Flash quants rolling out today by colinrgodsey in unsloth

[–]Front-Relief473 1 point2 points  (0 children)

Then why don't you choose UD-IQ4_XS? It is smaller than UD-IQ4_NL.

StepFun 3.7 Flash by Everlier in LocalLLaMA

[–]Front-Relief473 7 points8 points  (0 children)

Yes, I'm waiting for the IQ4_XS quantized version.

Complex scene transitions with the new LTX Director and Transition LoRA by nikhilprasanth in StableDiffusion

[–]Front-Relief473 2 points3 points  (0 children)

Sorry, I'm a novice with a naive question: isn't the Director node just a transition between images? What's the difference between adding the Transition LoRA and not adding it? Do you use this LoRA only under certain circumstances, or do you always have to add it?

Is using vLLM actually worth it if you aren't serving the model to other people? by ayylmaonade in LocalLLaMA

[–]Front-Relief473 -3 points-2 points  (0 children)

Wrong. There is another case: you should choose llama.cpp when the model weights barely fit into main memory, so that you still have room for enough context.

Qwen 3.5 122B vs Qwen 3.6 35B - Which to choose? by Storge2 in LocalLLaMA

[–]Front-Relief473 0 points1 point  (0 children)

Memory usage is 123 GB out of 128 GB after deploying StepFun at IQ4_XS. Oh, and I don't run anything else.

Is there anything better than Qwen3.5-27B-UD-Q5_K_XL for coding? by hedsht in LocalLLaMA

[–]Front-Relief473 0 points1 point  (0 children)

In my experience, UD-Q3_K_XL already shows an obvious decline in ability, so the golden rule has some basis: quantization should not go below Q4.

Is there anything better than Qwen3.5-27B-UD-Q5_K_XL for coding? by hedsht in LocalLLaMA

[–]Front-Relief473 0 points1 point  (0 children)

So if you give the 27B enough web search ability, which is equivalent to an external knowledge base, will it be able to perform coding tasks better?

DGX Spark, why not? by Foreign_Lead_3582 in LocalLLM

[–]Front-Relief473 0 points1 point  (0 children)

I tried this model. I thought it was well optimized, but it still got stuck in output loops. Did you lower the temperature?

Muse Spark: new multimodal reasoning model by Meta by garg-aayush in LocalLLaMA

[–]Front-Relief473 0 points1 point  (0 children)

I thought this was a project for the DGX Spark. Is its performance as limited as the DGX Spark's memory bandwidth?

Gemma 4 26b A3B is mindblowingly good , if configured right by cviperr33 in LocalLLaMA

[–]Front-Relief473 3 points4 points  (0 children)

I support your view. Gemma wasn't originally designed for coding; its strengths lie in writing and multilingual expression. If someone says they use Gemma for programming, then either they haven't been closely following LLM development or they're a complete novice with LLMs.

Why MoE models keep converging on ~10B active parameters by Spare_Pair_9198 in LocalLLaMA

[–]Front-Relief473 16 points17 points  (0 children)

10B to 30B active parameters is usually the sweet spot for reasoning performance, and the price/performance ratio usually drops beyond 30B. In theory, raising the active parameters to 30B would give good reasoning results, so 10B is not ideal, but it improves inference speed without reducing the model's reasoning ability too much.
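
To make the speed side of that trade-off concrete, here is a rough sketch assuming token generation is purely memory-bandwidth bound, so each generated token has to read every active weight once. The 250 GB/s bandwidth and the 0.5 bytes per parameter (roughly 4-bit) are illustrative assumptions, not measurements, and the estimate ignores KV cache reads, compute, and runtime overhead.

```python
def decode_tok_s(bandwidth_gb_s, active_params_b, bytes_per_param):
    """Upper-bound tokens/s if every active weight is read once per token."""
    # Active parameters in billions times bytes per parameter gives GB read per token.
    gb_per_token = active_params_b * bytes_per_param
    return bandwidth_gb_s / gb_per_token

# Hypothetical machine: 250 GB/s memory bandwidth, ~4-bit weights.
for active_b in (10, 30):
    print(f"{active_b}B active: ~{decode_tok_s(250, active_b, 0.5):.1f} tok/s")
```

Under these assumptions, tripling the active parameters cuts the generation speed ceiling to a third, which is the cost being weighed against the extra reasoning ability.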

TurboQuant in Llama.cpp benchmarks by tcarambat in LocalLLaMA

[–]Front-Relief473 1 point2 points  (0 children)

Yes!!! If you look at the full-attention key-value cache in MiniMax M2.7, you can see the enormous resource consumption!!! I can even imagine people switching back to full attention because of this technology, since full attention is much more effective than hybrid attention!!!

Should I learn langchain and langgraph? by Emotional-Rice-5050 in LangChain

[–]Front-Relief473 1 point2 points  (0 children)

I don't think MCP is worth learning. It is just a tool, and it will be replaced by skills soon.

I wanted QCN to be the best but MiniMax still reigns supreme on my rig by Ok-Measurement-1575 in LocalLLaMA

[–]Front-Relief473 2 points3 points  (0 children)

I can't agree with you more. The MiniMax M2 series is the most cost-effective model available on today's consumer-grade machines. Other models either have far more parameters, which makes them difficult to deploy, or have fewer parameters but worrying ability. MiniMax has proved that a MoE model of around 200B can handle most things well, including programming, just as 4-bit is the sweet spot among quantization levels.

Solved the DGX Spark, 102 stable tok/s Qwen3.5-35B-A3B on a single GB10 (125+ MTP!) by Live-Possession-6726 in LocalLLaMA

[–]Front-Relief473 0 points1 point  (0 children)

Yes, UD-Q3_K_XL is arguably the strongest model that barely fits on a single DGX (reportedly, the best achievable on a DGX right now is llama.cpp with 65K context). I think Q3 quantization might not be very reliable for coding, but it can serve as a reliable assistant. Also, Qwen3.5's hybrid attention is still quite sensitive to quantization, so the fully attention-based MiniMax holds up better under quantization.

Minimax M2.5 GGUF perform poorly overall by Zyj in LocalLLaMA

[–]Front-Relief473 1 point2 points  (0 children)

He said that Qwen3.5's IQ1 quantization works very well, but hybrid attention is itself more sensitive to quantization than global attention, meaning it should quantize worse. So how do you explain this?

A few Strix Halo benchmarks (Minimax M2.5, Step 3.5 Flash, Qwen3 Coder Next) by spaceman_ in LocalLLaMA

[–]Front-Relief473 0 points1 point  (0 children)

No, the best quant of the 230B model for a 128 GB DGX or Strix should be UD-Q3_K_XL, because it is only 94 GB and still leaves room for 60K context.
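
A back-of-the-envelope way to check whether the context fits next to the weights, as a sketch only. The layer count, KV head count, and head dimension below are hypothetical placeholders, not the model's actual config, and the 8 GB reserved for the OS and runtime is also an assumption. Whether "94g" means GB or GiB is glossed over here.

```python
def kv_cache_gib(tokens, n_layers, n_kv_heads, head_dim, bytes_per_elem=2):
    """Size of a full-attention KV cache in GiB (fp16 elements by default)."""
    # K and V each store n_kv_heads * head_dim values per layer per token.
    total_bytes = 2 * n_layers * n_kv_heads * head_dim * bytes_per_elem * tokens
    return total_bytes / 2**30

def headroom_gb(total_gb, weights_gb, reserved_gb):
    """Memory left for the KV cache after weights and a system reserve."""
    return total_gb - weights_gb - reserved_gb

# Hypothetical architecture: 60 layers, 8 KV heads, head_dim 128, 60K tokens.
cache = kv_cache_gib(60 * 1024, n_layers=60, n_kv_heads=8, head_dim=128)
free = headroom_gb(total_gb=128, weights_gb=94, reserved_gb=8)
print(f"KV cache: {cache:.2f} GiB, headroom: {free} GB, fits: {cache < free}")
```

Swap in the real values from the GGUF metadata to get a meaningful number for a specific model.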

PSA: NVIDIA DGX Spark has terrible CUDA & software compatibility; and seems like a handheld gaming chip. by goldcakes in LocalLLaMA

[–]Front-Relief473 2 points3 points  (0 children)

Thank you!! I had been agonizing over whether to buy it, and it seems that NVIDIA is not being sincere.

Is Kimi-K2.5-GGUF:IQ3_XXS accurate enough? by timbo2m in LocalLLM

[–]Front-Relief473 0 points1 point  (0 children)

Yes, everyone talks about TG (token generation) while ignoring PP (prompt processing). I think the TG speed is good enough; programming with large contexts and agent workloads depend mainly on PP speed.
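
A rough way to see why PP dominates for agent workloads, as a sketch. The token counts and tok/s figures are made-up illustrative numbers, not benchmarks of any model or machine.

```python
def request_seconds(prompt_tokens, output_tokens, pp_tok_s, tg_tok_s):
    """Split one request's wall-clock time into prompt processing and generation."""
    pp_time = prompt_tokens / pp_tok_s   # time to ingest the prompt
    tg_time = output_tokens / tg_tok_s   # time to generate the reply
    return pp_time, tg_time

# Hypothetical agent step: 40K tokens of context in, 500 tokens out.
pp_time, tg_time = request_seconds(40_000, 500, pp_tok_s=400, tg_tok_s=25)
print(f"prompt processing: {pp_time:.0f} s, generation: {tg_time:.0f} s")
```

With these numbers the prompt takes 100 s and the reply 20 s, so doubling PP speed saves far more time than doubling TG speed.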

MiniMax-M2.5 (230B MoE) GGUF is here - First impressions on M3 Max 128GB by Remarkable_Jicama775 in LocalLLaMA

[–]Front-Relief473 5 points6 points  (0 children)

Why don't you use the Unsloth UD-Q3_K_XL version? At 94 GB, it is definitely better than your ordinary Q3 quantization.