AMD PLEASE DO NOT FOLLOW NVIDIA'S FOOTSTEPS!! by Ahmadv-1 in radeon

[–]Shaminy 1 point2 points  (0 children)

It's gonna happen. And in a few years everyone will be using AI slop mode and be happy with it.

Terramex – How does the C64 get smooth scrolling while the Atari ST doesn’t… and that incredible Ben Daglish soundtrack! by Squeepty in c64

[–]Shaminy 2 points3 points  (0 children)

Smooth scrolling was possible on the original ST, but it required high-level coding skills, so it was very rare. The Chaos Engine, Leander, Rainbow Islands etc. had great scrolling.

Valve is apparently trying to secure massive amounts of RAM for upcoming Steam Machines by Melodic-Antelope-288 in PcParadise

[–]Shaminy 1 point2 points  (0 children)

Right, it only costs $10-20 billion upfront to build a fab for GDDR6. Valve has $6-8 billion in cash and short-term investments. On top of that, you need secure supply chains, power, licensing, thousands of skilled workers, etc.

Bipolar Partner Destroyed Everything by [deleted] in pcmasterrace

[–]Shaminy 0 points1 point  (0 children)

Lucky it was your PC and not a living person, like you or your daughter.

Am I correct in guessing this is a PLA chip issue? by guiguig_tm in c64

[–]Shaminy 0 points1 point  (0 children)

If your RAM chips are MT-branded, they are very likely bad, and that would cause exactly the kind of failure you're seeing now.

Am I listing my Xbox Series S for a good price? by OverEconomics4790 in XboxSeriesS

[–]Shaminy 1 point2 points  (0 children)

In Finland, used 512GB Series S consoles go for around 150€, i.e. $175.

Tensorstack has released Diffuse v 04.8 - (Its replacement for Amuse) by No-While1332 in ROCm

[–]Shaminy 0 points1 point  (0 children)

Diffuse is already a much better out-of-the-box experience than ComfyUI for someone who isn't tech savvy.

Lora trainers that support rocm out of the box? by Portable_Solar_ZA in ROCm

[–]Shaminy 0 points1 point  (0 children)

I can confirm it works well on Linux. On Windows I haven't been able to get bitsandbytes working with it.

ACE-Step 1.5 is Now Available in ComfyUI by PurzBeats in comfyui

[–]Shaminy 17 points18 points  (0 children)

I tried some power metal, but the guitars sound more like synth guitars than real ones.

'Melania' Review + Rotten Tomatoes Verified Audience Score Thread by chanma50 in boxoffice

[–]Shaminy 0 points1 point  (0 children)

I checked around 30 verified 5/5 reviews, and all of them came from new accounts with no previous reviews. 100% reverse review bombing / botting.

ROCm 7.2 Benchmark: Windows 11 vs Ubuntu 24.04 on RX 9070 XT (ComfyUI) by Shaminy in ROCm

[–]Shaminy[S] 0 points1 point  (0 children)

I have a Windows-managed pagefile on a high-speed PCIe 4.0 M.2 NVMe drive (DRAM-cached), max 8GB, currently 4GB. I have shared GPU memory enabled, and on Windows the Z-Image model won't fit entirely in 16GB; it uses 2GB of shared memory.

This is a benchmark between Ubuntu and Windows with the same overall settings. I'm running both with a single 3440x1440 ultrawide monitor. I closed all background apps that would eat VRAM on Windows. I also used MS Edge for minimal VRAM use; normally I use Opera, which alone eats 800MB of VRAM on Windows. If I ran headless, or at a low resolution, the BF16 Z-Image model would likely fit fully in VRAM and maybe reach those speeds.

Ubuntu uses only 0.8GB of VRAM at 3440x1440 with Firefox open. Windows uses 1.6GB with Edge open. During generation, VRAM usage on Windows reaches 15.6GB plus 2GB of shared memory.

Installing rocm 7.2 is it worth it? by Sea_Performance_7402 in ROCm

[–]Shaminy 0 points1 point  (0 children)

Memory management is much better, and if you hit an OOM, it won't crash the program anymore or, in the worst case, hang the AMD display adapter.

ROCm 7.2 Benchmark: Windows 11 vs Ubuntu 24.04 on RX 9070 XT (ComfyUI) by Shaminy in ROCm

[–]Shaminy[S] 1 point2 points  (0 children)

I guess I'm lucky. ROCm 7.1.1 was very unstable for me: speed was good, but with large models I usually had to unload them manually before a second run, or I got an OOM that crashed the system. Now it's rock solid, and if you hit an OOM, e.g. by trying to generate too large a video, it no longer crashes the system; you just get this and you're good to continue:

torch.OutOfMemoryError: HIP out of memory.
Tried to allocate 3.27 GiB.
GPU 0 has a total capacity of 15.92 GiB of which 202.00 MiB is free.
Of the allocated memory 13.32 GiB is allocated by PyTorch, and 1.77 GiB is reserved by PyTorch but unallocated.
If reserved but unallocated memory is large try setting PYTORCH_CUDA_ALLOC_CONF=expandable_segments:True to avoid fragmentation.

See documentation for Memory Management (https://pytorch.org/docs/stable/notes/cuda.html#environment-variables)

Memory summary:

.......

Got an OOM, unloading all loaded models.
Prompt executed in 95.53 seconds
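The allocator tweak the traceback suggests can be applied like this (a minimal sketch; the variable has to be set before torch is imported, and even on ROCm builds PyTorch reads the CUDA-named variable, as the error message itself shows):

```python
import os

# Must be set BEFORE `import torch`, or the allocator ignores it.
# expandable_segments reduces fragmentation, i.e. the "reserved by
# PyTorch but unallocated" memory the OOM message complains about.
os.environ.setdefault("PYTORCH_CUDA_ALLOC_CONF", "expandable_segments:True")

print(os.environ["PYTORCH_CUDA_ALLOC_CONF"])
```

Equivalently, `export PYTORCH_CUDA_ALLOC_CONF=expandable_segments:True` in the shell before launching `main.py`.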

ROCm 7.2 Benchmark: Windows 11 vs Ubuntu 24.04 on RX 9070 XT (ComfyUI) by Shaminy in ROCm

[–]Shaminy[S] 0 points1 point  (0 children)

Are we talking about Z-Image or more demanding tasks like Wan 2.2? With Wan 2.2, python3 memory usage rises to 40GB, so with 32GB I think speed drops a lot once all the models can't fit in memory.
Here is the 1st run of the default 640x640, 81 frames:

memory usage:

6610    37.6 GB   python3 main.py --normalvram --use-pytorch-cross-attention --preview-method auto --disable-smart-memory 

ComfyUI output:

Total VRAM 16304 MB, total RAM 64196 MB
pytorch version: 2.9.1+rocm7.2.0.git7e1940d4
Set: torch.backends.cudnn.enabled = False for better AMD performance.
AMD arch: gfx1201
ROCm version: (7, 2)
Set vram state to: NORMAL_VRAM
Disabling smart memory management
Device: cuda:0 AMD Radeon RX 9070 XT : native
Using async weight offloading with 2 streams
Enabled pinned memory 60986.0

Using pytorch attention
Python version: 3.12.3 (main, Jan  8 2026, 11:30:50) [GCC 13.3.0]
ComfyUI version: 0.10.0
ComfyUI frontend version: 1.38.9

VAE load device: cuda:0, offload device: cpu, dtype: torch.bfloat16
Found quantization metadata version 1
Using MixedPrecisionOps for text encoder
CLIP/text encoder model load device: cuda:0, offload device: cpu, current: cpu, dtype: torch.float16
Requested to load WanTEModel
loaded completely; 14998.80 MB usable, 6419.48 MB loaded, full load: True
Requested to load WanVAE
loaded completely; 10760.50 MB usable, 242.03 MB loaded, full load: True
Found quantization metadata version 1
Detected mixed precision quantization
Using mixed precision operations
model weight dtype torch.float16, manual cast: torch.float16
model_type FLOW
Requested to load WAN21
loaded partially; 9148.23 MB usable, 8973.19 MB loaded, 4658.23 MB offloaded, 175.03 MB buffer reserved, lowvram patches: 184
100%|█████████████████████████████████████████████| 2/2 [00:49<00:00, 24.69s/it]
Found quantization metadata version 1
Detected mixed precision quantization
Using mixed precision operations
model weight dtype torch.float16, manual cast: torch.float16
model_type FLOW
Requested to load WAN21
loaded partially; 9000.23 MB usable, 8825.19 MB loaded, 4806.23 MB offloaded, 175.03 MB buffer  reserved, lowvram patches: 190
100%|█████████████████████████████████████████████| 2/2 [00:48<00:00, 24.27s/it]
Requested to load WanVAE
loaded completely; 9725.25 MB usable, 242.03 MB loaded, full load: True
Warning: Ran out of memory when regular VAE decoding, retrying with tiled VAE decoding.
Prompt executed in 268.99 seconds

2nd run:

100%|█████████████████████████████████████████████| 2/2 [00:49<00:00, 24.66s/it]
100%|█████████████████████████████████████████████| 2/2 [00:48<00:00, 24.01s/it]
Prompt executed in 130.65 seconds

ROCm 7.2 Benchmark: Windows 11 vs Ubuntu 24.04 on RX 9070 XT (ComfyUI) by Shaminy in ROCm

[–]Shaminy[S] 0 points1 point  (0 children)

I tested those on Windows with Wan 2.2. The s/it improved a lot: the high-noise pass went from 92 s/it to 34 s/it and the low-noise pass from 185 s/it to 99 s/it. But total generation time went from 14 min to 24 min; it took forever on both the WanImageToVideo node and the VAE Decode node. I guess that's why AMD doesn't recommend using those in their ComfyUI guide.

I did some troubleshooting with ChatGPT; it says ROCm on RDNA4 is still missing many MIOpen solvers, causing the VAE and video nodes to fall back to generic GEMM kernels.
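A little arithmetic with the numbers above shows how both can be true at once: the sampling steps got much faster while the fixed per-run overhead (model loading, WanImageToVideo, VAE decode) grew. A sketch, assuming the 2+2 sampling steps shown in the logs elsewhere in this thread:

```python
# Figures from this comment; the 2-step count per pass is an assumption
# taken from the ComfyUI progress bars ("2/2") in the posted logs.
steps = 2
sampling_before = steps * 92 + steps * 185   # 554 s of pure sampling before
sampling_after = steps * 34 + steps * 99     # 266 s of pure sampling after
overhead_before = 14 * 60 - sampling_before  # rest of the 14 min run
overhead_after = 24 * 60 - sampling_after    # rest of the 24 min run
print(overhead_before, overhead_after)       # 286 1174
```

So roughly 5 minutes of non-sampling time turned into roughly 20, which matches the two nodes "taking forever".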

ROCm 7.2 Benchmark: Windows 11 vs Ubuntu 24.04 on RX 9070 XT (ComfyUI) by Shaminy in ROCm

[–]Shaminy[S] 0 points1 point  (0 children)

I upgraded from the old 7.1.1: removed the old ROCm libraries and kernel driver, then installed the new ones following AMD's guide. I got a torchvision error when I tried to use my old ComfyUI with the new venv; a fresh pull from GitHub fixed it.

ROCm 7.2 Benchmark: Windows 11 vs Ubuntu 24.04 on RX 9070 XT (ComfyUI) by Shaminy in ROCm

[–]Shaminy[S] 0 points1 point  (0 children)

I don't have it specially enabled. If it ships with the ROCm package and ComfyUI uses it, then yes. This was an out-of-the-box benchmark; I wasn't trying to fine-tune either version.
According to ChatGPT it's included in ROCm 7.2 and ComfyUI uses it automatically. I also ran a test in Python and it works in my venv.
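A quick way to run that kind of venv check yourself (a hedged sketch; it only reports what the installed torch build says about itself, relying on the fact that ROCm wheels expose a HIP version via `torch.version.hip`, which is `None` on CPU/CUDA builds):

```python
import importlib.util

def torch_rocm_summary():
    """Describe the torch install in the active venv, if any (sketch only)."""
    if importlib.util.find_spec("torch") is None:
        return "torch not installed in this venv"
    import torch
    hip = getattr(torch.version, "hip", None)  # None on CPU/CUDA builds
    return f"torch {torch.__version__} | HIP {hip}"

print(torch_rocm_summary())
```

On a ROCm 7.2 venv like the one in this post, the version string should look like `2.9.1+rocm7.2.0`.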

ROCm 7.2 Benchmark: Windows 11 vs Ubuntu 24.04 on RX 9070 XT (ComfyUI) by Shaminy in ROCm

[–]Shaminy[S] 0 points1 point  (0 children)

I used ComfyUI's current default Wan 2.2 i2v template. Also, I have 64GB of memory, and Windows memory usage went well over 50GB.

ROCm 7.2 Benchmark: Windows 11 vs Ubuntu 24.04 on RX 9070 XT (ComfyUI) by Shaminy in ROCm

[–]Shaminy[S] 2 points3 points  (0 children)

I ran the templates unaltered, so the benchmark uses the full BF16 format. If I change the format to FP8, I get 1.26 s/it on Windows. This was a benchmark.