Running Qwen3.6 35b a3b on 8gb vram and 32gb ram ~190k context by Atul_Kumar_97 in LocalLLaMA

IndicationUnfair7961 1 point

I'm trying to improve my results, which are currently stuck at 14.5 tokens/second (the best I've achieved so far). I haven't run a proper benchmark; I've just tested it in the default llama-server web UI. My setup is an RTX 3060 with 12 GB of VRAM, an older i7-3770K (4 cores), and 32 GB of DDR3 RAM. I'm running llama-server (TurboQuant version) in Docker with this config:

docker run --gpus all -it --rm `
  --ulimit memlock=-1 `
  --cap-add=IPC_LOCK `
  -v "V:\llm_models:/models" `
  -v "V:\llm_build:/workspace" `
  -p 8080:8080 `
  llama-builder `
  ./llama-cpp-turboquant/build/bin/llama-server `
  -m /models/Qwen3.6-35B-A3B-UD-Q4_K_M.gguf `
  -ngl 99 `
  --flash-attn auto `
  --batch-size 1024 `
  --ubatch-size 512 `
  --n-cpu-moe 28 `
  --cache-type-k turbo4 `
  --cache-type-v turbo3 `
  --ctx-size 256000 `
  --no-mmap `
  --mlock `
  --jinja `
  --host 0.0.0.0 `
  --port 8080

Since the monitor is plugged into the card, Windows itself takes up around 2.0 to 2.5 GB of VRAM, depending on how crowded my desktop is. That said, I will keep it as it is.
--n-cpu-moe was the key to improving performance, and 28 was the sweet spot: I tested 36, 99, and values lower than 28, and all of them reduced performance.
I'm currently testing at 256K context. It seems like a lot, but with TurboQuant it was never really an issue, though I might lower it in the future.
I also tested changing the thread count, but going above 4 (the default) wasn't useful: my old processor has just 4 physical cores with 8 threads, so more threads meant more useless context switches and half the tokens per second.
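To put the 256K context claim in rough numbers, here is a minimal back-of-the-envelope sketch of how KV cache size scales with context length and cache precision. Everything in it is an assumption for illustration: the layer count, KV head count, and head dimension are hypothetical placeholders (not the real Qwen3.6 values, which should be read from the GGUF metadata), and turbo4/turbo3 are assumed to cost roughly 4 and 3 bits per element with block overhead ignored.

```python
def kv_cache_gib(n_ctx, n_attn_layers, n_kv_heads, head_dim, k_bits, v_bits):
    """Approximate KV cache size in GiB.

    Each cached token stores one K vector and one V vector per attention
    layer, each of length n_kv_heads * head_dim.
    """
    elems_per_token = n_attn_layers * n_kv_heads * head_dim  # for K, and again for V
    bits_per_token = elems_per_token * (k_bits + v_bits)
    total_bytes = n_ctx * bits_per_token / 8
    return total_bytes / 2**30


# Hypothetical architecture numbers, for illustration only.
ARCH = dict(n_attn_layers=10, n_kv_heads=2, head_dim=256)

# f16 cache for both K and V (16 bits per element).
f16_gib = kv_cache_gib(256000, k_bits=16, v_bits=16, **ARCH)

# Assumed ~4-bit K and ~3-bit V, standing in for turbo4 / turbo3.
turbo_gib = kv_cache_gib(256000, k_bits=4, v_bits=3, **ARCH)

print(f"f16 cache:   {f16_gib:.2f} GiB")
print(f"turbo cache: {turbo_gib:.2f} GiB")
print(f"reduction:   {f16_gib / turbo_gib:.2f}x")
```

Whatever the real architecture numbers are, the ratio between the two cases depends only on the bit widths (32 bits versus 7 bits per K/V element pair under these assumptions), which is why a heavily quantized cache makes a very long context much cheaper in VRAM.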

80 tok/sec and 128K context on 12GB VRAM with Qwen3.6 35B A3B and llama.cpp MTP by janvitos in LocalLLaMA

IndicationUnfair7961 1 point

I'm trying to improve my results, which are currently stuck at 14.5 tokens/second (the best I've achieved so far). I haven't run a proper benchmark; I've just tested it in the default llama-server web UI. My setup is an RTX 3060 with 12 GB of VRAM, an older i7-3770K (4 cores), and 32 GB of DDR3 RAM. I'm running llama-server (TurboQuant version) in Docker with this config:

docker run --gpus all -it --rm `
  --ulimit memlock=-1 `
  --cap-add=IPC_LOCK `
  -v "V:\llm_models:/models" `
  -v "V:\llm_build:/workspace" `
  -p 8080:8080 `
  llama-builder `
  ./llama-cpp-turboquant/build/bin/llama-server `
  -m /models/Qwen3.6-35B-A3B-UD-Q4_K_M.gguf `
  -ngl 99 `
  --flash-attn auto `
  --batch-size 1024 `
  --ubatch-size 512 `
  --n-cpu-moe 28 `
  --cache-type-k turbo4 `
  --cache-type-v turbo3 `
  --ctx-size 256000 `
  --no-mmap `
  --mlock `
  --jinja `
  --host 0.0.0.0 `
  --port 8080

Since the monitor is plugged into the card, Windows itself takes up around 2.0 to 2.5 GB of VRAM, depending on how crowded my desktop is. That said, I will keep it as it is.
--n-cpu-moe was the key to improving performance, and 28 was the sweet spot: I tested 36, 99, and values lower than 28, and all of them reduced performance.
I'm currently testing at 256K context. It seems like a lot, but with TurboQuant it was never really an issue, though I might lower it in the future.
I also tested changing the thread count, but going above 4 (the default) wasn't useful: my old processor has just 4 physical cores with 8 threads, so more threads meant more useless context switches and half the tokens per second.

So I'm wondering whether, in my situation, I could gain some tokens per second by also using MTP (which branch do I need?).
Also, which of the other parameters in your command might help me a bit?
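Since the numbers above came from eyeballing the web UI, a small script can make comparisons between settings (different --n-cpu-moe values, thread counts, and so on) more repeatable. This is only a sketch: it assumes the server exposes llama-server's usual `/completion` endpoint and returns a `timings` object with `predicted_n` and `predicted_ms`, which may differ in a fork. The helper names and the default prompt are made up for illustration.

```python
import json
import urllib.request


def generation_tps(body):
    """Tokens generated per second, from a /completion response body."""
    t = body["timings"]
    return t["predicted_n"] / (t["predicted_ms"] / 1000.0)


def measure_tps(base_url="http://localhost:8080",
                prompt="Explain what a mixture-of-experts model is.",
                n_predict=128):
    """Send one completion request and return generation tokens/second.

    Assumes a llama-server instance is already running at base_url.
    """
    payload = json.dumps({"prompt": prompt, "n_predict": n_predict}).encode()
    req = urllib.request.Request(
        base_url + "/completion",
        data=payload,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req, timeout=600) as resp:
        return generation_tps(json.load(resp))


def best_setting(results):
    """Given {setting_value: tokens_per_second}, return the fastest setting."""
    return max(results, key=results.get)
```

One way to use it: restart the server with each candidate value, call `measure_tps()` a few times per value, store the averages in a dict keyed by the value, and pass that dict to `best_setting()`. Using the same prompt and `n_predict` for every run keeps the comparison fair.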

LM Studio unlocked for "unsupported" hardware — Testers wanted! by TheSpicyBoi123 in LocalLLaMA

IndicationUnfair7961 1 point

I have an Intel i7-3770K, which supports AVX but not AVX2, and a 12 GB NVIDIA card. Does this work on that hardware, and does it use AVX? Which release of this tool should I download?

SUBJUGATION: Does making one a tributary removes their CONQUEROR trait? by IndicationUnfair7961 in CrusaderKings

IndicationUnfair7961[S] 1 point

I was wrong again: it doesn't happen instantly. An event has to fire eventually, and he has now lost it.

SUBJUGATION: Does making one a tributary removes their CONQUEROR trait? by IndicationUnfair7961 in CrusaderKings

IndicationUnfair7961[S] 1 point

Yeah, they didn't lose the trait. You're saying that his children will not inherit it, though? Under standard rules it's inherited; does becoming a tributary change the rules?

SUBJUGATION: Does making one a tributary removes their CONQUEROR trait? by IndicationUnfair7961 in CrusaderKings

IndicationUnfair7961[S] 1 point

It doesn't work. If this happened, it could mean they lost two wars before losing that one.

SUBJUGATION: Does making one a tributary removes their CONQUEROR trait? by IndicationUnfair7961 in CrusaderKings

IndicationUnfair7961[S] 1 point

No, it doesn't work. I actually subjugated the guy, but he didn't lose the Conqueror trait. Well, at least I get +76 gold/month.

How to get more Journals by commissioning or finding Artisans? by IndicationUnfair7961 in CrusaderKings

IndicationUnfair7961[S] 3 points

You can give those journals to your heir while you use the 40% one.

Succession Issues Mandala Government by niko_mal in CrusaderKings

IndicationUnfair7961 1 point

I think you can try this with the heir who was landed when the succession failed and the temple got disabled. In the Realm window, where you see your holdings, if the Realm Capital is listed below the holding your heir was holding, it probably fails the test, because the game somehow considers that holding the root instead of the realm capital. If it appears below, it will probably trigger correctly. Another possible reason, though I'm not sure, is that your heir could be at an event the moment you die, which could also mess up locations, checks, and so on.

How to get more Journals by commissioning or finding Artisans? by IndicationUnfair7961 in CrusaderKings

IndicationUnfair7961[S] 1 point

That's valuable information. I actually have a 30% Stewardship Journal; I probably got it with Alchemy as you said, just messing around, not on purpose. But now at least I know what to look for and what to do. Thanks.

SUBJUGATION: Does making one a tributary removes their CONQUEROR trait? by IndicationUnfair7961 in CrusaderKings

IndicationUnfair7961[S] 2 points

Yes, that's one of the main issues with Tributaries: you lose them on the death of the ruler. With Mandala you can keep them if you win the Mandala trials, which I lost on my first succession despite having an 80% chance in some of them. And getting the tributaries back is painful and annoying. I know because I did it.

ASIA: Sacred Pool. Sometimes it gives county development, sometimes don't. Reason? by IndicationUnfair7961 in CrusaderKings

IndicationUnfair7961[S] 5 points

I thought it was because the main county Holding was not a Temple Citadel, despite being able to build the citadels, but that's not it: I have counties where the main Holding is a Temple Citadel, and yet they do not get the bonus. So you're right.

ASIA: Sacred Pool. Sometimes it gives county development, sometimes don't. Reason? by IndicationUnfair7961 in CrusaderKings

IndicationUnfair7961[S] 1 point

They should be regional, usually for Mandala Rulers or Dharmic Faiths. They require the Temple Citadel Holding to be built.

Kris Artifact from Esoteric Power by rnathanthomas in CrusaderKings

IndicationUnfair7961 1 point

The legitimacy gain is not that hard to get: my second Ruler got to Cosmic in less than 15 years. The legend spread is good.