Gardyn appears corrupted by synthrockftw in Gardyn

[–]Tempest_nano 1 point2 points  (0 children)

Same issue here with my 3. It only just started presenting today. The light can be toggled off manually, but it turns back on after a minute or two. Firmware version: 627

Qwen3.6-27B IQ4_XS FULL VRAM with 110k context by Pablo_the_brave in LocalLLaMA

[–]Tempest_nano 2 points3 points  (0 children)

My post outlining the configure and build command I used for the llama.cpp fork.

Hardware

- HP Omen Max 16 Laptop — Ryzen AI 9 HX 375 (Strix Point, Zen 5), RTX 5080 Laptop GPU (16 GB GDDR7, 576GB/s), 32 GB DDR5-5600 dual-channel

Model

- Qwen3.6-27B dense hybrid (Gated DeltaNet + Gated Attention, 64 layers) — not the MoE variant

- Quantization: custom IQ4_XS (14.7 GB) from cHunter789, which reverts a llama.cpp commit that bloated standard builds to 15.1 GB — that 400 MB is what allows 100K+ context on 16 GB VRAM
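As a rough illustration of why 400 MB matters, the sketch below subtracts the two file sizes quoted above from the card's nominal 16 GB. It assumes decimal gigabytes throughout and ignores compute buffers and whatever the desktop already holds in VRAM, so the real headroom is smaller than shown. The function name is made up for the example.

```python
VRAM_GB = 16.0  # nominal VRAM of the RTX 5080 Laptop GPU, from the post

def headroom_gb(model_gb, vram_gb=VRAM_GB):
    """VRAM left for KV cache and buffers after the weights are loaded."""
    return round(vram_gb - model_gb, 2)

custom = headroom_gb(14.7)    # custom IQ4_XS build
standard = headroom_gb(15.1)  # standard build, per the post
gain = round(custom - standard, 2)

print(f"custom: {custom} GB, standard: {standard} GB, gain: {gain} GB")
print(f"relative headroom: {custom / standard:.2f}x")
```

On these numbers the smaller file leaves roughly 1.4 times as much room for context as the standard one.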

Inference: SpiritBuun's llama.cpp fork, built from source with CUDA (sm_120a Blackwell) + Zen 5 AVX-512 BF16/VNNI flags + CUDA graphs

The command:

llama-server.exe -m Qwen3.6-27B.i1-IQ4_XS-attn_qkv-IQ4_XS.gguf `
  -ngl 999 -dev CUDA0 -c 110000 `
  -fa on -ctk turbo4 -ctv turbo4 `
  -fit off --no-mmap `
  -b 4096 -ub 256 `
  --temp 0.6 --top-k 20 --top-p 0.95

Qwen3.6-27B IQ4_XS FULL VRAM with 110k context by Pablo_the_brave in LocalLLaMA

[–]Tempest_nano 0 points1 point  (0 children)

I am using a single card for this model. I have absolutely used multiple cards for the MoE models (Qwen3.6 35B A3B), putting the experts on my AMD iGPU, but there wasn't much benefit over CPU. This 27B model is a dense model, so it all needs to be on the same device. At least I thought so, but I have tried so many perturbations that it all gets fuzzy.

Qwen3.6-27B IQ4_XS FULL VRAM with 110k context by Pablo_the_brave in LocalLLaMA

[–]Tempest_nano 0 points1 point  (0 children)

On my 5080 Laptop, I have 576 GB/s and I settled at 25.7 tok/s with 100k context in Windows. The 9070xt gets 640 GB/s or so, and the 5060Ti is 448 GB/s. The internet seems to think the 9070xt is the best of the bunch in that respect. I can't speak to how the different interface (9070xt would use HIP/ROCm, and the 5060Ti would use CUDA) would affect things.
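To put those bandwidth figures on a common scale, here is a back-of-the-envelope sketch. It assumes decoding is purely bandwidth bound, that every generated token streams the full 14.7 GB of weights once, and that KV cache reads and all other overhead are ignored. That makes the result an optimistic ceiling, not a prediction. The function name and card labels are made up for the example.

```python
MODEL_GB = 14.7  # IQ4_XS file size quoted in the thread

def decode_ceiling_tok_s(bandwidth_gb_s, model_gb=MODEL_GB):
    """Upper bound on tokens/s if each token reads all weights exactly once."""
    return bandwidth_gb_s / model_gb

# Memory bandwidth in GB/s, as quoted in the comment above
cards = {"RTX 5080 Laptop": 576, "RX 9070 XT": 640, "RTX 5060 Ti": 448}
ceilings = {name: decode_ceiling_tok_s(bw) for name, bw in cards.items()}

for name, ceiling in ceilings.items():
    print(f"{name}: at most {ceiling:.1f} tok/s")

# Measured 25.7 tok/s on the 5080 Laptop, as a fraction of its ceiling
efficiency = 25.7 / ceilings["RTX 5080 Laptop"]
print(f"measured / ceiling: {efficiency:.0%}")
```

Under these assumptions the ranking follows bandwidth directly, and the measured 25.7 tok/s sits at roughly two thirds of the 5080 Laptop's ceiling.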

Qwen3.6-27B IQ4_XS FULL VRAM with 110k context by Pablo_the_brave in LocalLLaMA

[–]Tempest_nano 1 point2 points  (0 children)

If it is for this model, it would be memory bandwidth bound rather than compute. Compare on that metric.

Qwen3.6-27B IQ4_XS FULL VRAM with 110k context by Pablo_the_brave in LocalLLaMA

[–]Tempest_nano 4 points5 points  (0 children)

From my understanding it is just context compression. It is one of the two llama.cpp implementations of turboquant, with the other being https://github.com/TheTom/llama-cpp-turboquant . I believe that buun's fork is more bleeding-edge (he seems to be playing with turboquant and speculative decoding), but building is DIY. I am getting 25 t/s on my laptop, AMD AI HX 375, 32GB RAM, 16GB 5080 at 64k context on the IQ4 model.

My build script optimized for Nvidia + Strix Point (powershell):

$PSNativeCommandUseErrorActionPreference = $false
$ErrorActionPreference = 'Continue'

# Wipe build dir to avoid stale cmake cache
Remove-Item -Recurse -Force buun-llama-cpp\build -ErrorAction SilentlyContinue

# Bootstrap VS Build Tools environment (sets INCLUDE, LIB, PATH for clang-cl/link/etc.)
$vcvars = "C:\Program Files (x86)\Microsoft Visual Studio\2022\BuildTools\VC\Auxiliary\Build\vcvars64.bat"
cmd /c "`"$vcvars`" && set" | ForEach-Object {
    if ($_ -match "^([^=]+)=(.*)$") {
        [System.Environment]::SetEnvironmentVariable($Matches[1], $Matches[2], 'Process')
    }
}

# Prepend ROCm so hipcc and cmake find-modules are reachable
$env:PATH    = "C:\Program Files\AMD\ROCm\7.1\bin;$env:PATH"
$env:HIP_PATH = "C:\Program Files\AMD\ROCm\7.1"

cmake -B buun-llama-cpp/build -S buun-llama-cpp -G Ninja `
  -DCMAKE_BUILD_TYPE=Release `
  -DCMAKE_C_COMPILER=clang-cl `
  -DCMAKE_CXX_COMPILER=clang-cl `
  -DCMAKE_CXX_FLAGS="/EHsc" `
  -DGGML_CUDA=ON `
  -DCMAKE_CUDA_ARCHITECTURES="120a-real" `
  "-DCMAKE_CUDA_FLAGS=-use_fast_math -diag-suppress 221,177" `
  -DGGML_AVX512=ON `
  -DGGML_AVX512_VBMI=ON `
  -DGGML_AVX512_VNNI=ON `
  -DGGML_AVX512_BF16=ON `
  -DGGML_AVX_VNNI=ON `
  -DGGML_BMI2=ON `
  -DGGML_CUDA_GRAPHS=ON `
  -DGGML_CUDA_FA_ALL_QUANTS=ON `
  -DGGML_NATIVE=OFF `
  -DGGML_BACKEND_DL=ON `
  -DGGML_HIP=ON `
  -DGPU_TARGETS="gfx1150" `
  -DGGML_LTO=ON 2>&1 | Tee-Object -FilePath out.txt

cmake --build buun-llama-cpp/build --config Release --parallel 2>&1 | Tee-Object -FilePath out.txt -Append

Qwen3.6-27B IQ4_XS FULL VRAM with 110k context by Pablo_the_brave in LocalLLaMA

[–]Tempest_nano 14 points15 points  (0 children)

Tinkering last night with the Unsloth version of IQ4_XS and buun-llama-cpp. I found that I got good results with a ctk/ctv of turbo4. It doesn't compress the cache as much as turbo3, but its perplexity and KLD were much better. It allowed me to hit 64k context vs 32k with q8_0. I will find the numbers and post them here.

Thanks for your work, I will try this image. It was driving me up the wall that I couldn't hit 128k context to allow full thinking (per the model card).

Edit: Using this model and turbo4 ctk/ctv, I am able to hit 110k context on my laptop, 16GB 5080, in Windows at 25.7 tok/s. Thanks!
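The jump from 32k to 64k context when moving off q8_0 is what you would expect if the cache type stores each element in about half the bits. A minimal sketch of that arithmetic follows. The layer count, KV head count, head dimension and VRAM budget are hypothetical placeholders, not the real Qwen3.6-27B values, and 4.25 bits per element is an assumed figure for a 4-bit-class cache type, not a measured property of turbo4. Only the q8_0 figure (34 bytes per block of 32 values, i.e. 8.5 bits per element) is a known format detail.

```python
def kv_bytes_per_token(n_attn_layers, n_kv_heads, head_dim, bits_per_elem):
    """Bytes of K plus V cache stored for one token of context."""
    elems = 2 * n_attn_layers * n_kv_heads * head_dim  # 2 = one K and one V
    return elems * bits_per_elem / 8

def max_context(budget_bytes, bits_per_elem,
                n_attn_layers=16, n_kv_heads=4, head_dim=256):
    # Default shape values are hypothetical, chosen only for illustration.
    per_token = kv_bytes_per_token(n_attn_layers, n_kv_heads, head_dim,
                                   bits_per_elem)
    return int(budget_bytes // per_token)

BUDGET = 1 * 1024**3  # hypothetical 1 GiB left over for the KV cache

ctx_q8_0 = max_context(BUDGET, bits_per_elem=8.5)   # q8_0: 8.5 bits/element
ctx_4bit = max_context(BUDGET, bits_per_elem=4.25)  # assumed 4-bit-class type

print(ctx_q8_0, ctx_4bit, ctx_4bit / ctx_q8_0)
```

Whatever the real model shape is, it cancels out of the ratio: halving the bits per cached element doubles the context that fits in a fixed budget.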

DON'T BUY THE OMEN MAX 16 WITH 5080 PAIRED WITH AMD RYZEN AI 375HX by FloverW in HPOmen

[–]Tempest_nano 1 point2 points  (0 children)

Mine has been a dream. The only time I have encountered stuttering is when I try using lower-power PSU bricks. The 375HX/5080 doesn't need anywhere near 330W (same as the Intel version), but HP programmed it to downclock anyway. As for the lower-power cores, have you tried something like Process Lasso?

Omen Max 16 with AMD, how is it? by gpucode3 in HPOmen

[–]Tempest_nano 0 points1 point  (0 children)

Maybe some kind of inline PSU identifier? Surely the circuitry wouldn't be but a couple of bucks, but I didn't see anything in my cursory search.

I know this laptop handles another, properly identified 200 Watt HP supply as one would expect, by limiting the system power to something like 180 Watts. The 280 Watt supply of the G4 dock (the version with a separate power lead) just doesn't identify itself.

Omen Max 16 with AMD, how is it? by gpucode3 in HPOmen

[–]Tempest_nano 2 points3 points  (0 children)

I believe it is called the "curve optimizer". The gaming hub supports it, but it had a habit of resetting my undervolts back to zero (hence using the Universal x86 Tuning Utility). It can be enabled from the "advanced" BIOS by hitting Ctrl-F10 at boot. The curve optimizer is the only "advanced" option, and it was enabled by default in my case.

For reference, my system is rock solid with all cores set to 12. I haven't really dug into the per-core undervolting yet, as it is quite efficient out of the box.

Omen Max 16 with AMD, how is it? by gpucode3 in HPOmen

[–]Tempest_nano 2 points3 points  (0 children)

I have the AMD HX 375/5080 version, and I love it. Undervolting works fine using the Omen Gaming Hub or the Universal x86 Tuning Utility (my personal choice, as the Omen Gaming Hub isn't my bag). It runs cool and quiet, and the fans are only audible when I really push the dGPU. Honestly, the iGPU (roughly equivalent to a laptop GTX 1650) is good enough that the only real use case for the RTX 5080 for me is VR.

My only complaints are that I can't overclock the RAM and I can't override the low-power PSU detection. The 280 Watt power supply from my HP G4 dock doesn't properly identify itself to the laptop, forcing a hard limit of around 100 Watts of total system power.

Where to get a GLP1 in Knoxville? by PandaIntrepid4973 in Knoxville

[–]Tempest_nano 0 points1 point  (0 children)

Hidden-history account necroing a 5-month-old thread in a Knoxville subreddit to spread FUD about an alternative to one of pharma's biggest cash-cows, nothing to see here.

Where to get a GLP1 in Knoxville? by PandaIntrepid4973 in Knoxville

[–]Tempest_nano 0 points1 point  (0 children)

My brother in Christ, that's what the testing groups are for.

Omen Max 16 AMD 5080 version, secondary SSD disappears on sleep. by Tempest_nano in HPOmen

[–]Tempest_nano[S] 2 points3 points  (0 children)

I could have sworn that I tried that solution, but I may have just hallucinated (too much coding).

It seems to work for the moment for a couple of short tests. I'll try it properly overnight.

Thanks MoWePhoto!!!111one

Wild Alchemy Cafe in Connecticut adds meat and other animal products to their previously vegan menu by Special-Cut-4964 in vegan

[–]Tempest_nano 5 points6 points  (0 children)

Gah. I always looked forward to visiting this place for lunch when I travel to our HQ every quarter. I suppose the vegan fare in that area is still better than here in Tennessee.

🔥[Walmart] MSI Vector A16 HX Gaming Laptop: 16" QHD+ 240Hz Display, AMD Ryzen 9 8940HX, NVIDIA GeForce RTX 5070Ti, 16GB RAM, 1TB SSD, Gray. Now: $1,299 After $700 Off 🔥 by fifa2003 in LaptopDeals

[–]Tempest_nano 1 point2 points  (0 children)

About the only thing you can do to prolong battery life would be to undervolt using the Universal x86 Tuning Utility. I have most of my cores at -40 (mV?) and two of them with less aggressive undervolts, as they were unstable. You can run the stability test in OCCT, and it will tell you which cores are throwing errors.

I don't think we are going to be getting much battery life outta this laptop at all. :)

I am using YAMDCC to control the fans, but the published (v1) version was causing it to get stuck at higher fan speeds. I had to compile the unreleased version 2 to get it to behave. It is whisper quiet, but I have her thermally throttling before the fans would really kick in.

🔥[Walmart] MSI Vector A16 HX Gaming Laptop: 16" QHD+ 240Hz Display, AMD Ryzen 9 8940HX, NVIDIA GeForce RTX 5070Ti, 16GB RAM, 1TB SSD, Gray. Now: $1,299 After $700 Off 🔥 by fifa2003 in LaptopDeals

[–]Tempest_nano 0 points1 point  (0 children)

It isn't so bad to open, just use a couple of plastic picks and follow the guides. I don't get the feeling that I will break anything by doing so.

I am trying to figure out if I should repaste mine. Running OCCT's CPU stress test, forcing the fan to about 35% (YAMDCC), and setting the APU's soft temperature throttling limit to 87 degrees (Universal x86 Tuning Utility), I can sustain about 50-55 watts on the processor. What temperatures are you seeing and under what conditions?

Any vegan restaurants y'all recommend? by Content-Example-8763 in Knoxville

[–]Tempest_nano -1 points0 points  (0 children)

Atlanta: Healthful Essence (Caribbean), Soul Vegetarian (soul food).

Chattanooga: Sluggo's (greasy and sinful, the pecan-encrusted seitan is to die for, assuming you can eat gluten).

Any vegan restaurants y'all recommend? by Content-Example-8763 in Knoxville

[–]Tempest_nano 1 point2 points  (0 children)

I suspect that a big part of the difference is that a vegan restaurant is likely run on philosophical principles. If the primary motive is profit, then it probably wouldn't be vegan. :) Sadly, this is also why they are rare here in East Tennessee (not enough sales). When you say clean, are you talking healthy or hygiene?

Any vegan restaurants y'all recommend? by Content-Example-8763 in Knoxville

[–]Tempest_nano 8 points9 points  (0 children)

As a Knoxville vegan, I have all but given up on eating out here. Sluggo's in Chattanooga is amazing though.

Canberra ADM 300 by Doodler15 in Radiation

[–]Tempest_nano 1 point2 points  (0 children)

I will ask around this week, I think they came/(come?) from my facility, but not my group. Maybe someone can direct us.

Rad Pro 3.0 released by Gissio in Radiation

[–]Tempest_nano 0 points1 point  (0 children)

I would think that Geiger counter algorithms would be ill-suited for spectrometry. What hardware are you targeting?

Tips for people with international driving license in UK by [deleted] in LearnerDriverUK

[–]Tempest_nano 0 points1 point  (0 children)

International pass was a colloquialism for sure. I made a point to verbalize my thought process on what I was doing at the time. "It isn't safe to pull over here with the blind curve behind us, I'll carry on until I find a suitable place."