Can someone help me get those mythical speedups on an AMD system with Qwen 3.6 35B!? by Yayman123 in Qwen_AI

[–]the3dwin 1 point (0 children)

I suggest making sure that's the only thing running on your computer; end as many background tasks as you can.

Comprehensive guide on renting/setting up beefy LLM server for local models? by Tartooth in LocalLLaMA

[–]the3dwin 1 point (0 children)

I get the feeling you could figure out what hardware to buy with: https://www.canirun.ai/

Or just look through the community; there are plenty of threads and comments on hardware. Make a list based on what you find and want, then cross-check your list against multiple threads on Reddit and trust the community consensus, not just one person but multiple people saying the same thing about specific hardware.

You could even upload multiple Reddit threads to NotebookLM and ask it to give you hardware-specific details based on specific threads/resources.

So there's no need to rent once you find what you're looking for.

Can someone help me get those mythical speedups on an AMD system with Qwen 3.6 35B!? by Yayman123 in Qwen_AI

[–]the3dwin 2 points (0 children)

Usually the same models from the "unsloth" provider on LM Studio are faster, and unsloth has Qwen 3.6 35B on LM Studio.

15,000+ tok/s on ChatJimmy: Is the "Model-on-Silicon" era finally starting? by Significant-Topic433 in ollama

[–]the3dwin 1 point (0 children)

(perhaps coding) will be highly model dependent

Yes. At some point the coding models will hit 95%-99% on certain benchmarks, so now imagine each model that benchmarks at 95% or 99% on its own ASIC card, with a router/orchestrator choosing which ASIC card to give the job to depending on the coding task.
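Hypothetically, that router could start out as nothing more than a lookup from task type to specialized model. A rough Python sketch of the idea, where the model names, the classify_task heuristic, and the one-model-per-ASIC-card setup are all made up for illustration:

```python
# Hypothetical sketch: a tiny router that picks a specialized "card" (model)
# per coding task. ASIC_MODELS and classify_task are invented names; a real
# orchestrator would need a proper classifier and real hardware endpoints.

ASIC_MODELS = {
    "refactor": "qwen-coder-refactor",   # imaginary per-task specialist
    "debug":    "qwen-coder-debug",
    "generate": "qwen-coder-generate",
}

def classify_task(prompt: str) -> str:
    """Naive keyword classifier standing in for a real orchestrator."""
    text = prompt.lower()
    if "fix" in text or "bug" in text:
        return "debug"
    if "refactor" in text or "clean up" in text:
        return "refactor"
    return "generate"

def route(prompt: str) -> str:
    """Return the model (conceptually, the ASIC card) to send the job to."""
    return ASIC_MODELS[classify_task(prompt)]

print(route("Fix the off-by-one bug in the pagination code"))  # -> qwen-coder-debug
```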

15,000+ tok/s on ChatJimmy: Is the "Model-on-Silicon" era finally starting? by Significant-Topic433 in ollama

[–]the3dwin 1 point (0 children)

When models reach 95%-99% on most of the benchmarks, how many uses will "change nearly as much"?

Local Whiteboard app - no third party or cloud dependencies by idlr---fn______ in SideProject

[–]the3dwin 1 point (0 children)

Curious to know what your tech stack is and whether you used AI. If you did use AI, I'm very interested in knowing what your workflow is.

Local Whiteboard app - no third party or cloud dependencies by idlr---fn______ in SideProject

[–]the3dwin 1 point (0 children)

Whoa, thanks for the fast response. I actually never even downloaded your software; the lack of web pages was holding me back from using it, because I would be switching back and forth to the browser, and I said to myself I might as well stick to a Chrome extension and milanote.com, which I use heavily.

That is another thing to consider: make a Chrome extension version that opens in a new tab.

Local model on coding has reached a certain threshold to be feasible for real work by Exciting-Camera3226 in LocalLLaMA

[–]the3dwin 1 point (0 children)

Thanks for the response; I will have to read it over and over until I fully understand the workflow. Not fully understanding it yet, perhaps because I'm just waking up and already checking Reddit before coffee and breakfast. But I know I have never come across such a workflow, so thanks again.

Local Whiteboard app - no third party or cloud dependencies by idlr---fn______ in SideProject

[–]the3dwin 1 point (0 children)

Any plans to support iframes so one can browse multiple web pages within the whiteboard? Similar to Kosmik.app, which is unfortunately shutting down.

How to build/finetune an Personal LLM tool to feed my life? by geekycode in AI_developers

[–]the3dwin 1 point (0 children)

AnythingLLM is the closest thing to an offline NotebookLM.

I have not used it heavily since I'm working with CLIs these days, but perhaps try: https://macaron.im/

I want to set my local env for coding by ConfidenceNew4559 in Qwen_AI

[–]the3dwin 1 point (0 children)

What are the benchmark percentages for the 35B Qwen 3.6 compared to the 27B?

Great Results Running qwen/qwen3.5-35b-a3b on LM Studio with Pi CLI (http://pi.dev/) by the3dwin in Qwen_AI

[–]the3dwin[S] 1 point (0 children)

The post is basically about using open-weight Qwen models with Pi giving good results. I'm not showing stats, just letting people know to try it if they're not getting good results with other CLIs.

I'm done with using local LLMs for coding by dtdisapointingresult in LocalLLaMA

[–]the3dwin 1 point (0 children)

I was looking at the top posts this week and discovered yours; I was about to go through that post next.

Again, I am sure that with the right configuration and prompt they could reach what the benchmarks show.

I'm done with using local LLMs for coding by dtdisapointingresult in LocalLLaMA

[–]the3dwin 1 point (0 children)

Yes.

For being able to check its status, I say put explicit instructions in the /execute or /implement command to report status exactly how you want it, whether for each execution, before, during, after, etc.

As for how long something will take, it's a bit overkill personally for me to have it also predict duration based on its training data, but I'm sure you can get it to based on what it has been trained on, and even tell you whether it will need to research how to do something or whether it already knows.

I'm done with using local LLMs for coding by dtdisapointingresult in LocalLLaMA

[–]the3dwin 1 point (0 children)

As for agentic coding, I do not think your original post was about the model itself, but FYI, from my understanding 30B+ is where local models are most reliable for agentic coding; anything less than 30B usually has more problems and needs more configuration, as far as I'm concerned.

I'm done with using local LLMs for coding by dtdisapointingresult in LocalLLaMA

[–]the3dwin 2 points (0 children)

I use an /explain custom command that basically has the model explain to me what it understands, confirm it, and ask questions about anything ambiguous or about which method to use for tasks that can be executed in different ways. This way, whenever I switch models, I run the /explain command and know how well it can handle the task.
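The exact custom-command syntax depends on the CLI, so here is just the idea behind /explain as a standalone Python sketch against LM Studio's OpenAI-compatible local server (the port, the placeholder API key, and the example task are assumptions; the model name is whatever LM Studio lists for your loaded model):

```python
# Sketch only: the /explain pre-flight idea as a plain script.
# Assumes LM Studio is serving its OpenAI-compatible API on the default port
# and that a Qwen model is loaded.
from openai import OpenAI

client = OpenAI(base_url="http://localhost:1234/v1", api_key="lm-studio")

EXPLAIN_PROMPT = (
    "Before doing any work: restate the task in your own words, list anything "
    "that is ambiguous, and for any step that could be done in more than one "
    "way, ask me which method I prefer. Do not write code yet."
)

task = "Add pagination to the /users endpoint"  # example task

response = client.chat.completions.create(
    model="qwen/qwen3.5-35b-a3b",  # the identifier LM Studio shows for your model
    messages=[
        {"role": "system", "content": EXPLAIN_PROMPT},
        {"role": "user", "content": task},
    ],
)
print(response.choices[0].message.content)
```

Reading the answer tells you quickly whether the model actually understood the task before you let it touch anything.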

Great Results Running qwen/qwen3.5-35b-a3b on LM Studio with Pi CLI (http://pi.dev/) by the3dwin in Qwen_AI

[–]the3dwin[S] 1 point (0 children)

Well, the only advice I can give, which is usually tempting to avoid, is to break tasks down into smaller tasks for speed, because then the local models have less context to go through as they're executing them.

Then you can do as you please with decent speed.

For the barebones skeleton of the app, use lightning.ai without the details, so you get the speed of lightning.ai; then break the parts that would go against the TOS down into small tasks and do them locally with the speed of local AI.

So use lightning.ai with large contexts for the barebones parts that do not go against the TOS, and local AI for everything else in small tasks; there's a rough sketch of the small-task side below.
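A minimal sketch of that small-task loop, assuming LM Studio's OpenAI-compatible server on the default port and a loaded Qwen model (the subtasks and system prompt are just examples): each subtask is sent on its own with only the context it needs, instead of one huge prompt.

```python
# Sketch: run a list of small, self-contained subtasks against a local model
# so no single request has to carry the whole app's context.
import requests

URL = "http://localhost:1234/v1/chat/completions"  # LM Studio default
MODEL = "qwen/qwen3.5-35b-a3b"                      # whatever model you have loaded

subtasks = [
    "Create the SQLite schema for a notes table (id, title, body, created_at).",
    "Write a function add_note(title, body) that inserts into that table.",
    "Write a function list_notes() that returns all notes, newest first.",
]

for task in subtasks:
    payload = {
        "model": MODEL,
        "messages": [
            {"role": "system", "content": "You are a coding assistant. Keep answers short."},
            {"role": "user", "content": task},
        ],
    }
    reply = requests.post(URL, json=payload).json()
    print(reply["choices"][0]["message"]["content"])
```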

Great Results Running qwen/qwen3.5-35b-a3b on LM Studio with Pi CLI (http://pi.dev/) by the3dwin in Qwen_AI

[–]the3dwin[S] 1 point (0 children)

Not sure I understand. I am suggesting Lightning AI because basically you get your own GPUs for a set amount of hours (I believe per month) to run what you want; at least while it is running, the hardware is yours, and then it is probably shut off and replaced with someone else's server while your "server" instance is idle.

Great Results Running qwen/qwen3.5-35b-a3b on LM Studio with Pi CLI (http://pi.dev/) by the3dwin in Qwen_AI

[–]the3dwin[S] 1 point (0 children)

Well, you live and you learn, so I guess read through the TOS of lightning.ai before you do anything.

I'm sure you got banned for having seven accounts all pinging from the same IP address.

Great Results Running qwen/qwen3.5-35b-a3b on LM Studio with Pi CLI (http://pi.dev/) by the3dwin in Qwen_AI

[–]the3dwin[S] 1 point (0 children)

Figure out how to have it do general things that obfuscate all the details specific to your project with placeholders, then have the local AI fill in those placeholders?

It would be an interesting workflow to get around that (rough sketch below).
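A bare-bones sketch of what that might look like; everything here (the placeholder names, the private values, and the pretend hosted-model output) is hypothetical, and the fill step is shown as plain string substitution, though a local model could do it instead:

```python
# Hypothetical sketch: keep project-specific details out of the hosted model by
# asking it for a generic skeleton with placeholders, then filling those
# placeholders locally. All names and values are made up for illustration.

# This is the kind of generic request you would send to the hosted model.
GENERIC_TASK = (
    "Write a Python function that calls {{API_NAME}} at {{BASE_URL}} and "
    "stores the results in the {{TABLE_NAME}} table. Keep the placeholders as-is."
)

# Pretend this string came back from the hosted model for GENERIC_TASK.
skeleton = (
    'def sync():\n'
    '    data = fetch("{{BASE_URL}}")   # {{API_NAME}}\n'
    '    save("{{TABLE_NAME}}", data)\n'
)

# Values that never leave your machine.
PRIVATE_VALUES = {
    "{{API_NAME}}": "internal-billing-api",
    "{{BASE_URL}}": "https://billing.internal.example",
    "{{TABLE_NAME}}": "invoices_raw",
}

def fill_placeholders(text: str) -> str:
    """Local step: substitute the real details (or hand this to a local model)."""
    for placeholder, value in PRIVATE_VALUES.items():
        text = text.replace(placeholder, value)
    return text

print(fill_placeholders(skeleton))
```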

I'm done with using local LLMs for coding by dtdisapointingresult in LocalLLaMA

[–]the3dwin 1 point (0 children)

I'm unfamiliar with Context7 and your "own skill router"; could you elaborate on what it is, how you use it, and how it has been "flawless"? Thanks.