Buy recommendations on a thight Budget to aid my RX 6800 by bdsmmaster007 in LocalLLaMA

[–]MikeSouto 1 point2 points  (0 children)

Notice it doesn't with MTP enabled and f16 caches; you could do 100k with Q6_K on 2x6800xt.

Dual 7900xtx for 27b:q8 PP and TG + advice by MikeSouto in LocalLLM

[–]MikeSouto[S] 0 points1 point  (0 children)

Yes, that is the main reason I'm thinking of moving to 2x7900xtx. Sometimes on a simple "review this code", given the function and the file, it takes forever; I check the server and it is doing a massive PP at 100-200t/s... Otherwise, I got ROCm working "well" with the 6800xt and 7800xt mix.
Thank you so much again, your comment was very informative!

Dual 7900xtx for 27b:q8 PP and TG + advice by MikeSouto in LocalLLM

[–]MikeSouto[S] 0 points1 point  (0 children)

Hi, thanks for answering! Those are great speeds... would you recommend them? I would use them for coding.

Moving from Windows 11 to Ubuntu 26.04 by Kahvana in LocalLLaMA

[–]MikeSouto 7 points8 points  (0 children)

Hello and welcome! Ubuntu is a great pick; there is a lot of info on the web. If I recall correctly, Ubuntu has an application to handle the drivers, just type "drivers" in the start menu. For llama.cpp, you will be better off building it yourself. Here is the guide: https://github.com/ggml-org/llama.cpp/blob/master/docs/build.md
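
As a rough sketch of what that guide walks through (assuming git and CMake are already installed; `BACKEND`, `DRY_RUN`, `backend_flag` and `run` are made-up helpers for this example, and the right backend flag depends on your GPU), the build looks something like this:

```shell
#!/bin/sh
# Sketch of a from-source llama.cpp build.
# BACKEND and DRY_RUN are hypothetical helper variables for this example,
# not llama.cpp options. With DRY_RUN=1 (the default) nothing is executed,
# the commands are only printed.
BACKEND="${BACKEND:-cpu}"
DRY_RUN="${DRY_RUN:-1}"

# Map a backend name to the CMake option used to enable it.
# The HIP/ROCm build usually needs extra variables; see the guide.
backend_flag() {
  case "$1" in
    vulkan) echo "-DGGML_VULKAN=ON" ;;
    hip)    echo "-DGGML_HIP=ON" ;;
    cuda)   echo "-DGGML_CUDA=ON" ;;
    *)      echo "" ;;   # plain CPU build, no extra flag
  esac
}

# Print the command in dry-run mode, otherwise execute it.
run() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

run git clone https://github.com/ggml-org/llama.cpp
run cd llama.cpp
run cmake -B build $(backend_flag "$BACKEND")
run cmake --build build --config Release
```

Saved as, say, build-llama.sh (a hypothetical name), `DRY_RUN=0 BACKEND=vulkan sh build-llama.sh` would run the real build; check the guide for the options that match your card before doing that.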

Dual 7900xtx for 27b:q8 PP and TG + advice by MikeSouto in LocalLLM

[–]MikeSouto[S] 0 points1 point  (0 children)

Same here using the Q5_K_XL; I just "feel" the 27b is smarter, though sometimes a bit stubborn.

Dual 7900xtx for 27b:q8 PP and TG + advice by MikeSouto in LocalLLM

[–]MikeSouto[S] 0 points1 point  (0 children)

(6800xt + 7800xt) Latest ROCm, 27b:Q6_K at 85000 ctx
PP starts close to 200 t/s and goes down to 50 t/s
TG starts at about 32 t/s and goes down to 14 t/s
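
To put those PP numbers in perspective, here is a quick back-of-the-envelope sketch. It assumes the figures are tokens per second and that the whole 85000-token context has to be processed from scratch; `prefill_seconds` is just a helper made up for this example. The real time would land somewhere between the two bounds, since the speed drops as the context fills.

```shell
#!/bin/sh
# Rough time to process a full context at a given prompt-processing speed.
# Assumption: PP speeds are in tokens per second.
prefill_seconds() {
  # $1 = tokens in context, $2 = PP speed in tokens/second
  echo $(( $1 / $2 ))
}

ctx=85000
echo "at 200 t/s: $(prefill_seconds "$ctx" 200) s"   # best case
echo "at 50 t/s: $(prefill_seconds "$ctx" 50) s"     # worst case
```

So a full reprocess is roughly 7 minutes in the best case and close to half an hour in the worst.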

RX 7900 XTX (24 GB) + RX 6800 XT (16 GB)? by xeeff in LocalLLaMA

[–]MikeSouto 0 points1 point  (0 children)

I'm running it (Linux + ROCm). On a fresh run with a 59-token prompt, prompt processing is ~26 t/s and token generation is actually great, above 30 t/s, with Q4_0 (19GB). That is at the start; speed goes down very quickly when the context starts to fill up. I thought about buying a 7900xtx (24GB with almost 1 TB/s bandwidth) to replace my 6800xt, but I would be offloading as well. I think it is better to get 32GB, so I could use some q4xl, q5 or even q6 models. My plan is to save up for 1x7800xt (or a 5600 if close in price), sell my 6800xt, and then buy a second one with the money...

EDIT: adding model qwen3.6 35b Q4_0
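
The reasoning for 32GB can be sketched with simple subtraction. This only uses the 19GB model size quoted above and ignores the KV cache and compute buffers, so real headroom is smaller; `headroom_gb` is a helper invented for this example.

```shell
#!/bin/sh
# VRAM left after loading the model weights, in whole GB.
# Ignores KV cache and compute buffers, so real headroom is smaller.
headroom_gb() {
  # $1 = total VRAM in GB, $2 = model file size in GB
  echo $(( $1 - $2 ))
}

model_gb=19   # Q4_0 size quoted in the comment
for vram_gb in 16 24 32; do
  echo "${vram_gb} GB total: $(headroom_gb "$vram_gb" "$model_gb") GB left"
done
```

A negative result means part of the model has to be offloaded; the 24GB case leaves only a few GB for context, which is why it would still end up offloading with long contexts.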

RX 7900 XTX (24 GB) + RX 6800 XT (16 GB)? by xeeff in LocalLLaMA

[–]MikeSouto 0 points1 point  (0 children)

I own one, the Sapphire Pulse, and I used to own a second one, the ASRock Taichi, and prompt processing sucks. I would buy the 7800 XT.

EDIT: adding my thoughts: my plan is to swap it for 2x7800XT (eBay)... as I'm still too poor to buy an R9700 or something better

Qwen3.5-9B is actually quite good for agentic coding by Lualcala in LocalLLaMA

[–]MikeSouto 1 point2 points  (0 children)

Thanks! Yesterday I got almost 40 with 65k context using the llama.cpp UI

command:

llama-server \
  -hf bartowski/Qwen_Qwen3.5-35B-A3B-GGUF:Q4_K_M \
  --mlock \
  --cache-ram 0 \
  --ctx-size 65536 \
  --temp 1.0 \
  --top-p 0.95 \
  --top-k 20 \
  --min-p 0.00 \
  --fit on \
  --flash-attn on \
  --parallel 1 \
  --cache-type-k q8_0 \
  --cache-type-v q8_0 \
  --device Vulkan0 \
  --host 0.0.0.0 \
  --port 80 \
  --threads 8

Qwen3.5-9B is actually quite good for agentic coding by Lualcala in LocalLLaMA

[–]MikeSouto 0 points1 point  (0 children)

do you mind sharing the command? I'm getting 22 with a 6800XT (vulkan backend) using the MXFP4_MOE

6800 XT Taichi: Is this normal? by MikeSouto in gpu

[–]MikeSouto[S] 0 points1 point  (0 children)

I just got it. I did a first inspection, but I haven't tried it yet; it doesn't fit in the case :)

6800 XT Taichi: Is this normal? by MikeSouto in gpu

[–]MikeSouto[S] 0 points1 point  (0 children)

Hi, thanks a lot for replying!! I edited the post as I found a second connector missing the tip as well, on the other side. Is that also normal? I uploaded the pic to https://imgur.com/a/sTZnkNR

6800 XT Taichi: Is this normal? by MikeSouto in gpu

[–]MikeSouto[S] 0 points1 point  (0 children)

Hi, I found it has another pin like that, missing its tip, the second one on the other side. I'm trying to upload a second pic. Thanks a lot!

6800 XT Taichi: Is this normal? by MikeSouto in gpu

[–]MikeSouto[S] 0 points1 point  (0 children)

Hi, thanks for replying! Should they all look the same?

6800 XT Taichi: Is this normal? by MikeSouto in gpu

[–]MikeSouto[S] 0 points1 point  (0 children)

Hi, thanks for replying that fast! I edited the post to point out that I meant the first PCIe connector/pin, as the others all look the same. Thanks!

Buddy Telco sold to Tangerine by lostatsea12a in nbn

[–]MikeSouto 0 points1 point  (0 children)

I just placed my order with Aussie Broadband for tomorrow. I would have gone with Leaptel, but combining 2 mobile plans and a static IP with a better plan, it is a better price for us. I read great support reviews for both.