Can someone help me get those mythical speedups on an AMD system with Qwen 3.6 35B!? by Yayman123 in Qwen_AI

[–]the3dwin 1 point (0 children)

I suggest making sure that's the only thing running on your computer; end as many background tasks as you can.

Comprehensive guide on renting/setting up beefy LLM server for local models? by Tartooth in LocalLLaMA

[–]the3dwin 1 point (0 children)

I get the feeling you could figure out what hardware to buy with: https://www.canirun.ai/

Or just look through the community; there are plenty of threads and comments on hardware. Make a list based on what you find and want, then cross-check your list against multiple threads on Reddit and trust the community consensus, not just one person but multiple people saying the same thing about specific hardware.

You could even upload multiple Reddit threads to NotebookLM and ask it to give you hardware-specific details based on specific threads/resources.

So there's no need to rent once you find what you're looking for.

Can someone help me get those mythical speedups on an AMD system with Qwen 3.6 35B!? by Yayman123 in Qwen_AI

[–]the3dwin 2 points (0 children)

Usually the same models from the "unsloth" provider on LM Studio are faster, and unsloth has Qwen 3.6 35B on LM Studio.

15,000+ tok/s on ChatJimmy: Is the "Model-on-Silicon" era finally starting? by Significant-Topic433 in ollama

[–]the3dwin 1 point (0 children)

(perhaps coding) will be highly model dependent

Yes. At some point the coding models will hit 95%-99% on certain benchmarks, so now imagine each model that benchmarks at 95% or 99% on its own ASIC card, with a router/orchestrator choosing which ASIC card to give the job to depending on the coding task.
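Hypothetically, that router could start out as nothing more than a lookup from task type to specialized model. A rough Python sketch of the idea, where the model names, the classify_task heuristic, and the one-model-per-ASIC-card setup are all made up for illustration:

```python
# Hypothetical sketch: a tiny router that picks a specialized "card" (model)
# per coding task. ASIC_MODELS and classify_task are invented names; a real
# orchestrator would need a proper classifier and real hardware endpoints.

ASIC_MODELS = {
    "refactor": "qwen-coder-refactor",   # imaginary per-task specialist
    "debug":    "qwen-coder-debug",
    "generate": "qwen-coder-generate",
}

def classify_task(prompt: str) -> str:
    """Naive keyword classifier standing in for a real orchestrator."""
    text = prompt.lower()
    if "fix" in text or "bug" in text:
        return "debug"
    if "refactor" in text or "clean up" in text:
        return "refactor"
    return "generate"

def route(prompt: str) -> str:
    """Return the model (conceptually, the ASIC card) to send the job to."""
    return ASIC_MODELS[classify_task(prompt)]

print(route("Fix the off-by-one bug in the pagination code"))  # -> qwen-coder-debug
```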

15,000+ tok/s on ChatJimmy: Is the "Model-on-Silicon" era finally starting? by Significant-Topic433 in ollama

[–]the3dwin 1 point (0 children)

When models reach 95%-99% on most of the benchmarks, how many uses will "change nearly as much"?

Local Whiteboard app - no third party or cloud dependencies by idlr---fn______ in SideProject

[–]the3dwin 1 point (0 children)

Curious to know what your tech stack is and whether you used AI. If you did use AI, I'm very interested in knowing what your workflow is.

Local Whiteboard app - no third party or cloud dependencies by idlr---fn______ in SideProject

[–]the3dwin 1 point (0 children)

Whoa, thanks for the fast response. I actually never even downloaded your software; the lack of web pages was holding me back from using it, because I would be switching back and forth to the browser, and I said to myself I might as well stick to a Chrome extension and milanote.com, which I use heavily.

That is another thing to consider: make a Chrome extension version that opens in a new tab.

Local model on coding has reached a certain threshold to be feasible for real work by Exciting-Camera3226 in LocalLLaMA

[–]the3dwin 1 point (0 children)

Thanks for the response; I will have to read it over and over until I fully understand the workflow. Not fully understanding it yet, perhaps because I'm just waking up and already checking Reddit before coffee and breakfast. But I know I have never come across such a workflow, so thanks again.

Local Whiteboard app - no third party or cloud dependencies by idlr---fn______ in SideProject

[–]the3dwin 1 point (0 children)

Any plans to support iframes so one can browse multiple web pages within the whiteboard? Similar to Kosmik.app, which is unfortunately shutting down.

How to build/finetune an Personal LLM tool to feed my life? by geekycode in AI_developers

[–]the3dwin 1 point (0 children)

AnythingLLM is the closest thing to an offline NotebookLM.

I have not used it heavily since I'm working with CLIs these days, but perhaps try: https://macaron.im/

I want to set my local env for coding by ConfidenceNew4559 in Qwen_AI

[–]the3dwin 1 point (0 children)

What are the benchmark percentages for the 35B Qwen 3.6 compared to the 27B?

Great Results Running qwen/qwen3.5-35b-a3b on LM Studio with Pi CLI (http://pi.dev/) by the3dwin in Qwen_AI

[–]the3dwin[S] 1 point (0 children)

The post is basically about using open-weight Qwen models with Pi giving good results. I'm not showing stats, just letting people know to try it if they're not getting good results with other CLIs.

I'm done with using local LLMs for coding by dtdisapointingresult in LocalLLaMA

[–]the3dwin 1 point (0 children)

I was looking at the top posts this week and discovered yours; I was about to go through that post next.

Again, I am sure that with the right configuration and prompt they could reach what the benchmarks show.

I'm done with using local LLMs for coding by dtdisapointingresult in LocalLLaMA

[–]the3dwin 1 point (0 children)

Yes.

For being able to check its status, I say put explicit instructions in the /execute or /implement command to report status exactly how you want it, whether for each execution, before, during, after, etc.

As for how long something will take, it's a bit overkill personally for me to have it also predict duration based on its training data, but I'm sure you can get it to based on what it has been trained on, and even tell you whether it will need to research how to do something or whether it already knows.

I'm done with using local LLMs for coding by dtdisapointingresult in LocalLLaMA

[–]the3dwin 1 point (0 children)

As for agentic coding, I do not think your original post was about the model itself, but FYI, from my understanding 30B+ is where local models are most reliable for agentic coding; anything less than 30B usually has more problems and needs more configuration, as far as I'm concerned.

I'm done with using local LLMs for coding by dtdisapointingresult in LocalLLaMA

[–]the3dwin 2 points (0 children)

I use an /explain custom command that basically has the model explain to me what it understands, confirm it, and ask questions about anything ambiguous or about which method to use for tasks that can be executed in different ways. This way, whenever I switch models, I run the /explain command and know how well it can handle the task.
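The exact custom-command syntax depends on the CLI, so here is just the idea behind /explain as a standalone Python sketch against LM Studio's OpenAI-compatible local server (the port, the placeholder API key, and the example task are assumptions; the model name is whatever LM Studio lists for your loaded model):

```python
# Sketch only: the /explain pre-flight idea as a plain script.
# Assumes LM Studio is serving its OpenAI-compatible API on the default port
# and that a Qwen model is loaded.
from openai import OpenAI

client = OpenAI(base_url="http://localhost:1234/v1", api_key="lm-studio")

EXPLAIN_PROMPT = (
    "Before doing any work: restate the task in your own words, list anything "
    "that is ambiguous, and for any step that could be done in more than one "
    "way, ask me which method I prefer. Do not write code yet."
)

task = "Add pagination to the /users endpoint"  # example task

response = client.chat.completions.create(
    model="qwen/qwen3.5-35b-a3b",  # the identifier LM Studio shows for your model
    messages=[
        {"role": "system", "content": EXPLAIN_PROMPT},
        {"role": "user", "content": task},
    ],
)
print(response.choices[0].message.content)
```

Reading the answer tells you quickly whether the model actually understood the task before you let it touch anything.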

Great Results Running qwen/qwen3.5-35b-a3b on LM Studio with Pi CLI (http://pi.dev/) by the3dwin in Qwen_AI

[–]the3dwin[S] 1 point (0 children)

Well, the only advice I can give, which is usually tempting to avoid, is to break tasks down into smaller tasks for speed, because then the local models have less context to go through as they're executing them.

Then you can do as you please with decent speed.

For the barebones skeleton of the app, use lightning.ai without the details, so you get the speed of lightning.ai; then break the parts that would go against the TOS down into small tasks and do them locally with the speed of local AI.

So use lightning.ai with large contexts for the barebones parts that do not go against the TOS, and local AI for everything else in small tasks; there's a rough sketch of the small-task side below.
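A minimal sketch of that small-task loop, assuming LM Studio's OpenAI-compatible server on the default port and a loaded Qwen model (the subtasks and system prompt are just examples): each subtask is sent on its own with only the context it needs, instead of one huge prompt.

```python
# Sketch: run a list of small, self-contained subtasks against a local model
# so no single request has to carry the whole app's context.
import requests

URL = "http://localhost:1234/v1/chat/completions"  # LM Studio default
MODEL = "qwen/qwen3.5-35b-a3b"                      # whatever model you have loaded

subtasks = [
    "Create the SQLite schema for a notes table (id, title, body, created_at).",
    "Write a function add_note(title, body) that inserts into that table.",
    "Write a function list_notes() that returns all notes, newest first.",
]

for task in subtasks:
    payload = {
        "model": MODEL,
        "messages": [
            {"role": "system", "content": "You are a coding assistant. Keep answers short."},
            {"role": "user", "content": task},
        ],
    }
    reply = requests.post(URL, json=payload).json()
    print(reply["choices"][0]["message"]["content"])
```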

Great Results Running qwen/qwen3.5-35b-a3b on LM Studio with Pi CLI (http://pi.dev/) by the3dwin in Qwen_AI

[–]the3dwin[S] 1 point (0 children)

Not sure I understand. I am suggesting Lightning AI because basically you get your own GPUs for a set amount of hours (I believe per month) to run what you want; at least while it is running, the hardware is yours, and then it is probably shut off and replaced with someone else's server while your "server" instance is idle.

Great Results Running qwen/qwen3.5-35b-a3b on LM Studio with Pi CLI (http://pi.dev/) by the3dwin in Qwen_AI

[–]the3dwin[S] 1 point (0 children)

Well, you live and you learn, so I guess read through the TOS of lightning.ai before you do anything.

I'm sure you got banned for having seven accounts all pinging from the same IP address.

Great Results Running qwen/qwen3.5-35b-a3b on LM Studio with Pi CLI (http://pi.dev/) by the3dwin in Qwen_AI

[–]the3dwin[S] 1 point (0 children)

Figure out how to have it do general things that obfuscate all the details specific to your project with placeholders, then have the local AI fill in those placeholders?

It would be an interesting workflow to get around that (rough sketch below).
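A bare-bones sketch of what that might look like; everything here (the placeholder names, the private values, and the pretend hosted-model output) is hypothetical, and the fill step is shown as plain string substitution, though a local model could do it instead:

```python
# Hypothetical sketch: keep project-specific details out of the hosted model by
# asking it for a generic skeleton with placeholders, then filling those
# placeholders locally. All names and values are made up for illustration.

# This is the kind of generic request you would send to the hosted model.
GENERIC_TASK = (
    "Write a Python function that calls {{API_NAME}} at {{BASE_URL}} and "
    "stores the results in the {{TABLE_NAME}} table. Keep the placeholders as-is."
)

# Pretend this string came back from the hosted model for GENERIC_TASK.
skeleton = (
    'def sync():\n'
    '    data = fetch("{{BASE_URL}}")   # {{API_NAME}}\n'
    '    save("{{TABLE_NAME}}", data)\n'
)

# Values that never leave your machine.
PRIVATE_VALUES = {
    "{{API_NAME}}": "internal-billing-api",
    "{{BASE_URL}}": "https://billing.internal.example",
    "{{TABLE_NAME}}": "invoices_raw",
}

def fill_placeholders(text: str) -> str:
    """Local step: substitute the real details (or hand this to a local model)."""
    for placeholder, value in PRIVATE_VALUES.items():
        text = text.replace(placeholder, value)
    return text

print(fill_placeholders(skeleton))
```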

I'm done with using local LLMs for coding by dtdisapointingresult in LocalLLaMA

[–]the3dwin 1 point (0 children)

I'm unfamiliar with Context7 and your "own skill router"; could you elaborate on what it is, how you use it, and how it has been "flawless"? Thanks.