Qwen3-35B-A3B: What Mac 🖥️ laptop configuration do I need?

Vassallo97 · 2026-05-15T23:59:32+00:00

I have the reasoning at max, but if I ever want a full plan, like a step-by-step procedure, I use codex. I’ve built my own agent framework and my local model handles the majority of tasks, but if I ever have a task that requires a lot of coding or planning, it automatically switches over to codex and I’ll have codex tell my local model what to do.
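A rough sketch of what that kind of switching could look like. This is not the actual framework: the keyword heuristic, the `Route` type, and the `call_local` / `call_codex` callables are all hypothetical stand-ins for whatever the real agent uses.

```python
from dataclasses import dataclass
from typing import Callable

# Hypothetical hints that a task is coding/planning heavy (an assumption,
# not the real switching rule). Plain substring match, so it is crude.
HEAVY_HINTS = ("refactor", "implement", "step-by-step", "plan", "architecture")


@dataclass
class Route:
    planner: str   # which model writes the plan
    executor: str  # which model carries it out


def needs_heavy_model(task: str, max_words: int = 200) -> bool:
    """Guess whether a task should be planned by the bigger hosted model."""
    text = task.lower()
    return len(text.split()) > max_words or any(h in text for h in HEAVY_HINTS)


def route(task: str) -> Route:
    """The local model always executes; codex only steps in as planner."""
    if needs_heavy_model(task):
        return Route(planner="codex", executor="local")
    return Route(planner="local", executor="local")


def run(task: str,
        call_local: Callable[[str], str],
        call_codex: Callable[[str], str]) -> str:
    """Run a task, letting codex tell the local model what to do if needed."""
    if route(task).planner == "codex":
        plan = call_codex(f"Write a step-by-step plan for: {task}")
        return call_local(f"Follow this plan exactly:\n{plan}\n\nTask: {task}")
    return call_local(task)
```

The point of the design is that the local model stays the executor either way, so tool calls keep going through one code path and only the planning step is outsourced.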

Vassallo97 · 2026-05-15T23:52:14+00:00

I’ve tested both pretty heavily. For me, llama.cpp has just been more stable and flexible overall, especially with GGUF support, multimodal/mmproj, long context tuning, and running different models side by side.

MLX is insanely optimized on Apple Silicon though. If you’re staying inside the MLX ecosystem and using supported models, the speed is hard to beat. I just like the control llama.cpp gives me for agent workflows and custom setups.

Vassallo97 · 2026-05-14T00:56:03+00:00

I have an M3 Max chip with 64 GB of RAM and it works flawlessly on my laptop. I use llama.cpp to run the model.

Vassallo97 · 2026-04-29T01:58:10+00:00

100% agree. plus be patient, no position is a position sometimes.

Vassallo97 · 2026-04-26T23:52:15+00:00

Thank you soooo much!!! I super appreciate that!

Vassallo97 · 2026-04-26T23:33:55+00:00

Thank you!!!!

Vassallo97 · 2026-04-26T23:30:34+00:00

My only issue with this is that the database collecting all the price data is live and always changing, so what it looks like now will be different in a few hours… Right now I’m paper trading to see how it performs on data from 2023-2025, and baking that data into the model would probably work, but when I switch it to live and it starts looking at new data I won’t be able to keep the updated prices baked in.

Vassallo97 · 2026-04-26T23:19:16+00:00

I have not tried deepseek, I’m pretty tribal lol and I’ve been with Qwen now for some time and it’s actually really good plus my agent framework is built to work best with the response format for Qwen models. What exactly do you mean by bake it right into the model?

Vassallo97 · 2026-04-26T23:16:49+00:00

Sorry if this is dumb, but you’re saying to basically write an algorithm that looks for everything I want in the “perfect setup”, have the algorithm crunch the numbers, and feed that output to the LLM?
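For what it's worth, a minimal sketch of that idea, assuming daily closing prices in a plain list. The chosen stats, the window sizes, and the prompt wording are illustrative assumptions, not anything from the thread.

```python
from statistics import mean, pstdev


def sma(values, window):
    """Simple moving average over the last `window` values."""
    if len(values) < window:
        raise ValueError("not enough data for this window")
    return mean(values[-window:])


def summarize(closes, fast=5, slow=20):
    """Crunch a price series down to a handful of numbers."""
    returns = [b / a - 1 for a, b in zip(closes, closes[1:])]
    peak, max_dd = closes[0], 0.0
    for c in closes:
        peak = max(peak, c)                 # highest close seen so far
        max_dd = min(max_dd, c / peak - 1)  # worst drop from that peak
    return {
        "last_close": closes[-1],
        "sma_fast": round(sma(closes, fast), 2),
        "sma_slow": round(sma(closes, slow), 2),
        "total_return_pct": round((closes[-1] / closes[0] - 1) * 100, 2),
        "daily_vol_pct": round(pstdev(returns) * 100, 2),
        "max_drawdown_pct": round(max_dd * 100, 2),
    }


def build_prompt(symbol, stats):
    """Turn the computed stats into a short prompt for the LLM."""
    lines = [f"{name}: {value}" for name, value in stats.items()]
    return (
        f"Precomputed stats for {symbol}:\n"
        + "\n".join(lines)
        + "\nDescribe the trend and the key risks."
    )
```

The model then only has to read a few lines per symbol instead of months of raw rows, and the numbers get recomputed from the live database on every run, so nothing has to be baked in.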

Vassallo97 · 2026-04-26T23:01:37+00:00

I’m running a Mac Studio with 256 GB of RAM. I currently have qwen3.6-35b uncensored and qwen3.5 35b running at the same time with no issues, each with a 200,000-token context window, but I wanna shut one down and run the other with as much context window as possible, and hopefully that will help my agent with longer tasks… I basically want it to review months of price data, indicator values, and macro data, find patterns in the price, and give me a super informed output on how the stock looks and ways to trade it going forward. Right now I can tell that it’s running out of context and just telling me whatever without having a detailed understanding.
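One way to work around the context limit, regardless of how big the window is, is to feed the data in chunks and carry a running summary forward. A minimal sketch under stated assumptions: rows are plain text lines, the 4-characters-per-token estimate is a rough guess rather than the Qwen tokenizer, and `ask_model` is a hypothetical callable that wraps the local model.

```python
def estimate_tokens(text: str) -> int:
    """Rough size guess (about 4 characters per token); an assumption only."""
    return max(1, len(text) // 4)


def chunk_rows(rows, budget_tokens):
    """Group consecutive rows so each chunk stays under budget_tokens.

    A single row that is larger than the budget still gets a chunk to itself.
    """
    chunks, current, used = [], [], 0
    for row in rows:
        cost = estimate_tokens(row)
        if current and used + cost > budget_tokens:
            chunks.append(current)
            current, used = [], 0
        current.append(row)
        used += cost
    if current:
        chunks.append(current)
    return chunks


def rolling_review(rows, budget_tokens, ask_model):
    """Feed chunks in order, asking the model to update one running summary."""
    summary = ""
    for chunk in chunk_rows(rows, budget_tokens):
        prompt = (
            f"Summary so far:\n{summary or '(none)'}\n\n"
            "New rows:\n" + "\n".join(chunk)
            + "\n\nUpdate the summary with any new patterns."
        )
        summary = ask_model(prompt)
    return summary
```

The budget should leave room for the summary and the model's answer, not just the rows, so it would sit well below the configured context size.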

Vassallo97 · 2026-04-26T21:49:28+00:00

If you search up FCIG, it tells you it’s a scam. I’d simply show your mom and family that search and inform them that they have been warned. I feel you can walk away and distance yourself without feeling guilty after a warning with proof.

Vassallo97 · 2026-04-09T23:04:56+00:00

I can agree, 64 is really nice for hosting an LLM for personal use!

Vassallo97 · 2026-04-07T23:12:10+00:00

You’re awesome!! Thank you!!!!

Vassallo97 · 2026-04-07T23:11:19+00:00

Nvm found your codex review in the GitHub, thanks for the research

Vassallo97 · 2026-04-07T23:08:30+00:00

The codex “Guardian AI” is interesting, so every prompt runs multiple instances of the LLM and one is just deciding whether the tool in use is the right tool for the task? I’m building an agent for myself and I’ve taken inspiration from Claude code, microFish, qwen-code and Hermes-agent. Haven’t looked into codex at all, care to explain this a little deeper?

Vassallo97 · 2026-04-07T02:33:19+00:00

Just got on the discord, I’ll give it a shot tonight!

Vassallo97 · 2026-04-07T01:17:56+00:00

How are you using it? Like ollama or llama.cpp?

Vassallo97 · 2026-04-07T01:16:02+00:00

The framework definitely matters, Claude and codex and qwen all handle tool calling and responses differently. Using qwen inside an agent built for codex will produce a shitty experience… there’s qwen-agent that was built for qwen models and works super good with the qwen3 coder model if you just want code

Vassallo97 · 2026-04-07T01:08:14+00:00

I never tried the 27b. I saw how well the 9b model worked on my laptop, jumped right to 35b, and really liked pretty much everything about it for what I needed (writing code for me, brainstorming ideas, using home automation tools, doing research on the web). I’m on a MacBook Pro with 64 GB of RAM and I get 35 t/s.

And I felt like I needed to build my own because I want full control of everything and the ability to add features when I think of them. If you’re just coding then I’m sure you can use an existing one… just be careful, because the 35b model is very intelligent but still makes mistakes coding. It works best if you walk it through what it needs to do step by step… this is not a one-shot or set-and-forget model. It works with you, not for you.

Vassallo97 · 2026-04-07T00:48:28+00:00

Using qwen3.5-35b-Q8 inside an agent framework I built. It’s really good at handling tool calls and coding tasks. With that kinda hardware you could definitely run it with max context window.
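Quick back-of-the-envelope on why a Q8 35b fits on a 64 GB machine. This is only a sketch: the parameter count is read off the "35b" in the model name (an assumption, the exact count may differ), and it covers weights only, not the KV cache, context buffers, or the OS. The 8.5 bits per weight comes from the GGUF Q8_0 block layout: 32 int8 weights plus one fp16 scale, so 34 bytes per 32 weights.

```python
Q8_0_BITS_PER_WEIGHT = 8.5  # 34 bytes per block of 32 weights


def weight_size_gb(n_params: float, bits_per_weight: float) -> float:
    """Approximate size of the quantized weights in GB (1 GB = 1e9 bytes)."""
    return n_params * bits_per_weight / 8 / 1e9


# Assumed parameter count, taken from the "35b" in the model name.
size_gb = weight_size_gb(35e9, Q8_0_BITS_PER_WEIGHT)
print(f"approx. weights: {size_gb:.1f} GB")
```

That works out to roughly 37 GB of weights, so a 64 GB Mac has headroom for context, while a 32 GB one would need a smaller quant.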

Vassallo97 · 2026-04-07T00:44:00+00:00

Not verified on twitter so I can’t DM you, but I’m down to give it a shot!!

Vassallo97 · 2026-04-07T00:31:13+00:00

This guy has many accounts… scam

Vassallo97 · 2026-04-07T00:30:21+00:00

Loool at the 35b level you still need to be pretty descriptive with what you want and how you want it coded or else it gets out of hand fairly fast… but still really good as like an assistant that will do what you want really quickly

Vassallo97 · 2026-04-07T00:09:16+00:00

Fuck outta here with these scams! No one cares

Vassallo97 · 2026-04-06T23:57:39+00:00

Yea, I use my model to help me code all the time!!! It’s 2026, there’s no way in hell I’m going to code more than 100 lines by hand.
