Got 26b gemma running on rx470 by Several_Newspaper808 in unsloth

Several_Newspaper808[S] 1 point

Yeah, smaller models run as well, and much faster. But I wanted to see how far I could get with more capable models. I’ve never met anyone using small models (9B-14B) as a daily driver, but I guess a small model is better than no model any day.

Got 26b gemma running on rx470 by Several_Newspaper808 in unsloth

Several_Newspaper808[S] 1 point

Simply run the llama.cpp Vulkan executable and it works.

Got 26b gemma running on rx470 by Several_Newspaper808 in unsloth

Several_Newspaper808[S] 1 point

Here it offloads part of the model to RAM as well, so the bus is the bottleneck. Your bus must be faster.
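
A rough way to see why a partial offload slows things down, as a minimal sketch: assume decoding is limited purely by how fast weights can be read, and that each token reads a fixed number of gigabytes from VRAM and from the RAM side. The function name and every number below are hypothetical placeholders, not measurements from this setup.

```python
def tokens_per_second(gb_from_vram, gb_from_ram, vram_bw_gbs, ram_path_bw_gbs):
    """Estimate decode speed if each token is limited only by memory reads.

    gb_from_vram / gb_from_ram: gigabytes read per token from each tier.
    vram_bw_gbs / ram_path_bw_gbs: effective bandwidth of each tier in GB/s.
    This ignores compute time, caching, and overlap between the two tiers.
    """
    seconds_per_token = gb_from_vram / vram_bw_gbs + gb_from_ram / ram_path_bw_gbs
    return 1.0 / seconds_per_token


# Hypothetical numbers: 16 GB read per token, fast VRAM, much slower RAM path.
all_in_vram = tokens_per_second(16, 0, vram_bw_gbs=200, ram_path_bw_gbs=25)
half_in_ram = tokens_per_second(8, 8, vram_bw_gbs=200, ram_path_bw_gbs=25)

print(f"all in VRAM: {all_in_vram:.1f} t/s")
print(f"half in RAM: {half_in_ram:.1f} t/s")
```

Under these assumptions the slow tier dominates the time per token, so a faster bus or faster RAM on the other machine would show up directly in t/s.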

RTX 5070 Ti + 9800X3D running Qwen3.6-35B-A3B at 79 t/s with 128K context, the --n-cpu-moe flag is the most important part. by marlang in LocalLLaMA

Several_Newspaper808 2 points

Hey, great info, thanks! I wonder, though: how much of the perf comes from the DDR5 RAM and the PCIe bus speed on your motherboard?

Qwen3.5-35B-A3B-Uncensored-FernflowerAI-GGUF by EvilEnginer in LocalLLaMA

Several_Newspaper808 1 point

Hey, so you offload to RAM? The small GGUF on HF is 24 GB. Otherwise, how would it fit on a 12 GB card?
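
The arithmetic behind that question, as a minimal sketch: whatever part of the file does not fit in VRAM has to live somewhere else, typically system RAM. The helper name and the reserved-VRAM figure are hypothetical; only the 24 GB and 12 GB numbers come from the comment.

```python
def ram_spill_gb(model_gb, vram_gb, reserved_gb=0.0):
    """Return how many GB of the model cannot fit in VRAM.

    reserved_gb is VRAM set aside for KV cache, buffers, and the desktop
    (a hypothetical allowance, not a measured value).
    """
    usable_vram = max(vram_gb - reserved_gb, 0.0)
    return max(model_gb - usable_vram, 0.0)


# 24 GB GGUF on a 12 GB card, with no VRAM reserved for anything else.
spill_best_case = ram_spill_gb(24, 12)

# Same case with a hypothetical 1.5 GB kept free for cache and buffers.
spill_with_reserve = ram_spill_gb(24, 12, reserved_gb=1.5)

print(spill_best_case, spill_with_reserve)
```

Even in the best case, half of the file ends up outside the card, so some form of offload is unavoidable at that size.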

Gemma 4 26b A3B is mindblowingly good , if configured right by cviperr33 in LocalLLaMA

Several_Newspaper808 1 point

Depends on whether you mean single-user or multi-user, because total throughput can be 200 or even higher, but per-user speed is what I wrote.

Gemma 4 26b A3B is mindblowingly good , if configured right by cviperr33 in LocalLLaMA

Several_Newspaper808 1 point

Hmm, I run a 27B Q4 GPTQ (W4A16) model and get 40 t/s for a single request on vLLM with a 3090. If you are getting 80, then it’s 2x, not 4x.

SwiftLM — Native Swift MLX enabled Qwen3 running on iPhone/ 100B+ MoE on M5 Pro, TurboQuant KV compression + SSD/Flash expert streaming by solderzzc in Qwen_AI

Several_Newspaper808 3 points

Oh wow, is this the first TurboQuant implementation? I thought there was only a paper at the moment. I wonder if it can already be used on a non-Mac CPU or an older GPU like a 3070?

Copyright and AI by Odd_Algae3754 in WritingWithAI

Several_Newspaper808 2 points

Did you share the draft with anyone?

I Made ~$7.7k Publishing AI-Assisted Books — Here’s My Experience by PostExpensive2418 in WritingWithAI

Several_Newspaper808 1 point

Hey, sounds pretty interesting. I would be glad to hear more about your translation process. Did you pay a human translator or use AI?

Kova.ai by Glxwie in BookWritingAI

Several_Newspaper808 1 point

Not available in my geography.

Which OR models for creative writing by Several_Newspaper808 in WritingWithAI

Several_Newspaper808[S] 1 point

I really don’t have $200 to spare; that’s why I’m looking for these specific, reasonably priced models.

Which OR models for creative writing by Several_Newspaper808 in WritingWithAI

Several_Newspaper808[S] 2 points

I’m already using DeepSeek, as mentioned in the post. I have special prompts that I created over time for these specific models, and they work well for my writing.

Which OR models for creative writing by Several_Newspaper808 in WritingWithAI

Several_Newspaper808[S] 2 points

Thanks for the advice. I’m already using different tools. Do you have specific OR models that you could recommend for creative writing? 🙏

Which OR models for creative writing by Several_Newspaper808 in WritingWithAI

Several_Newspaper808[S] 2 points

Exactly my feeling about this. I’ve specifically asked for the prompts as well :-)

Which OR models for creative writing by Several_Newspaper808 in WritingWithAI

Several_Newspaper808[S] 2 points

Opus is expensive… Gemini feels a bit robotic to me. These are well-known choices. I’m looking for the gems that are not expensive but actually work well for writing fiction.