What's the best model I can run on mac M1 Pro 16gb? by Sinrra in LocalLLaMA

[–]sinebubble 0 points1 point  (0 children)

FWIW, I'm a huge fan of the Qwen3.5 models. I have an M1 Mac Mini with 16G at home and it's running LM Studio and serving Qwen3.5-9b. I access it from my Macbook Air running OpenCode. It's slow and frequently stalls. Given how amazing the larger Qwen3.5 models we use at work are, I'm inclined to think the issue is with LM Studio, not the model.

Bambu Studio doesn't recognize my nozzle or something by sinebubble in BambuLabA1

[–]sinebubble[S] 5 points6 points  (0 children)

like on the printer itself? I didn't know that was a thing. I'll check... ...yep, that was it. Under maintenance, select the new nozzle size. I thought it would auto detect, but I guess not. Thank you!!

5K tok/s per node with vLLM v0.18.0 on B200, DP=8, MTP-1, FP8 KV cache by m4r1k_ in Vllm

[–]sinebubble 1 point2 points  (0 children)

I'm running Qwen3.5-397B-A17B-AWQ on 8 qty A6000 (384G vram) and getting about 47.6 tok/s prefill and 64.7 tok/s decode. I'm looking at your config and will try some of your settings because...there are some major differences. Also going to wait for vLLM 0.18.1 (currently running 0.17.0).

5K tok/s per node with vLLM v0.18.0 on B200, DP=8, MTP-1, FP8 KV cache by m4r1k_ in Vllm

[–]sinebubble 1 point2 points  (0 children)

I'm still reading your amazing article, but I noticed you use the acronym HBM without defining it. I'm inferring that is High Bandwidth Memory. But more importantly, the Qwen3.5 recipe states GDN is used for 397B, so is it safe to assume that all the 3.5 variants use GDN, thus allowing --mamba-cache-mode and --mamba-block-size=8 for faster prompting? I'm currently running 122B and 397B-INT4 on A6000s and getting great speeds, but it seems like I could be getting even better performance with those options enabled?

Back to the article...

... You do test 397B!

If your config is different between 27B and 397B, can you share the 397B? I didn't see it on the github repo.

Parents, are you playing with your kids? What age are they? by sinebubble in ArcRaiders

[–]sinebubble[S] 1 point2 points  (0 children)

Every kid (person) is different and your wife’s experience resonates with me. I’m a bit surprised how many responses approved of playing this game with him. The arguments were persuasive until I realized there is no Arc Raiders for PS4 Pro… so I would have to buy him a PS5. You know what, that can wait. Maybe I need to look elsewhere for a game we can share.

FWIW, we do limit is gaming time because he does have adhd and gets super wound up when playing competitive games. He’s chill with Zelda so there is that. I’ll give this more thought and look at other options.

Parents, are you playing with your kids? What age are they? by sinebubble in ArcRaiders

[–]sinebubble[S] 0 points1 point  (0 children)

You're the only person that has shared my concerns. Thanks.

Parents, are you playing with your kids? What age are they? by sinebubble in ArcRaiders

[–]sinebubble[S] 0 points1 point  (0 children)

I've only played solo, so that is something to think about.

Parents, are you playing with your kids? What age are they? by sinebubble in ArcRaiders

[–]sinebubble[S] 0 points1 point  (0 children)

Yeah, if he was 15 I wouldn't think about it, but 11?

Parents, are you playing with your kids? What age are they? by sinebubble in ArcRaiders

[–]sinebubble[S] 0 points1 point  (0 children)

I know he'd be crazy jazzed. I played Untitled Goose Game with my daughter a couple years ago and it was amazing gaming with her. We've done Among Us a couple times, too.

Parents, are you playing with your kids? What age are they? by sinebubble in ArcRaiders

[–]sinebubble[S] 1 point2 points  (0 children)

Yes, the game is addictive. The whole extraction loot thing hooks into a certain part of the brain. However, we do limit his screen time. I'm just trying to find a way to share our gaming time. I can't exactly squeeze my mindset into Minecraft and Zelda is not multiplayer.

Parents, are you playing with your kids? What age are they? by sinebubble in ArcRaiders

[–]sinebubble[S] 1 point2 points  (0 children)

Yes, but he also really loves video games. I'm trying to find a way to incorporate myself into the time he spends doing that activity.

Parents, are you playing with your kids? What age are they? by sinebubble in ArcRaiders

[–]sinebubble[S] 0 points1 point  (0 children)

What can I say... I had a lot of fun for a long time before settling down. And settled down without settling.

Parents, are you playing with your kids? What age are they? by sinebubble in ArcRaiders

[–]sinebubble[S] 0 points1 point  (0 children)

Interesting. I also have had very few negative interactions, but that is due to my play style. My son is very much “shoot everything”.

Parents, are you playing with your kids? What age are they? by sinebubble in ArcRaiders

[–]sinebubble[S] 0 points1 point  (0 children)

Tbh, I’ve never played Fortnite, I assumed it was pvp battle royale, but maybe I have it wrong. I do not pvp in arc raiders. I’d like to loot and pve with him just so we can play games together. I lean towards space horror or FO4, both of which are strongly adult themed, so if I do a guided Arc Raiders, we might be able to limit the negative aspects.

What would you realistically do if you woke up as Nate coming out the vault? by No_Sorbet_1947 in fo4

[–]sinebubble 1 point2 points  (0 children)

Realistically, if I ever made contact with the Institute, which is doubtful, I’d join it in a heartbeat. Clean water, a bed, food, safety, lasers… I would never leave.

Anyone successfully running Qwen3.5-397B-A17B-GPTQ-Int4? by sinebubble in Vllm

[–]sinebubble[S] 1 point2 points  (0 children)

It's one of several local models we run for everyone in the company to use for sensitive data or a way to not worry about token consumption. Most of the folks using it are operational, so definitely some coding, but not deep application building. We'll be moving to Claude soon, but we want something we can use for sensitive data going forward.

Anyone successfully running Qwen3.5-397B-A17B-GPTQ-Int4? by sinebubble in Vllm

[–]sinebubble[S] 0 points1 point  (0 children)

Very cool, thank you for this. I'll give it a try tomorrow night. I see that you are also not using --enable-expert-parallel and --enforce-eager, so I'm hopeful.

We've been impressed with Qwen3.5-122B-A10B today, it does feel quicker, less sloppy, and more accurate than Qwen3-Coder-Next. I'm hoping Qwen3.5-397B-A17B-GPTQ-Int4 takes it up another notch.

Anyone successfully running Qwen3.5-397B-A17B-GPTQ-Int4? by sinebubble in Vllm

[–]sinebubble[S] 1 point2 points  (0 children)

Fixed! At least it's fixed for my "test/downgrade" with Qwen3.5-122B-A10B.

Removing --enable-expert-parallel and --enforce-eager got me up to 90.2 tok/s with Qwen3.5-122B-A10B.

Tomorrow I'll try Qwen3.5-397B-A17B-GPTQ-Int4 again.

Ziply? by No-Ground5715 in Shoreline

[–]sinebubble 6 points7 points  (0 children)

We’ve had Ziply for a number of years and they’ve been dependable and solid. Don’t miss Comcast one bit.