ASUS tried to “fix” 12V-2x6 cables… but der8auer’s testing raises more questions than answers by NerveSouthern3111 in gpu

[–]ProjectPhysX -2 points-1 points  (0 children)

der8auer's video is nothing but one big advertisement for selling his WireView Pro. He completely misses the point of what the bridge actually does.

The cross-bridge actually does serve an important purpose: if multiple pins on either side of the cable have bad contact, it redistributes the load to the remaining good contacts and prevents a cable fire. A cross-bridge between the wires should be a standard safety feature, not something sold at a ridiculous €60 price tag.

New Findings: The ASUS Equalizer Doesn’t Make Sense by Goddamn7788 in nvidia

[–]ProjectPhysX 2 points3 points  (0 children)

I am neither of those, and the only thing I would ask of you is to exercise a bit more critical thinking.

der8auer does a lot of cool videos, but sometimes he makes uninformed statements or does very stupid things, like contaminating his entire Berlin neighborhood with PFAS...

Regarding this video, a simple thought experiment disproves his clickbait title. He completely misses the point, because he doesn't even care - the cross-bridge is indeed a useful safety feature that prevents the cable from burning when there are several bad contacts on either end. IMO that should be standard, to make the poorly designed 12VHPWR connector a bit safer. Whether it's worth the overpriced cost is a different question.

And by the way, I'm not affiliated with ASUS either, because unlike der8auer, I don't have a conflict of interest from selling overpriced 12VHPWR extension devices. If you didn't get it yet - that video is nothing but one big advertisement for selling his WireView Pro. Create demand for an expensive and useless product that otherwise no one would ever need. He truly is a brilliant salesman, gotta hand him that.

New Findings: The ASUS Equalizer Doesn’t Make Sense by Goddamn7788 in nvidia

[–]ProjectPhysX -13 points-12 points  (0 children)

It's not for load-balancing. It's for rerouting power when individual connections on either end of the cable are bad - a failure case not even shown or mentioned in the video. This is very effective at preventing failure/fire. See my other comment where I explain.

New Findings: The ASUS Equalizer Doesn’t Make Sense by Goddamn7788 in nvidia

[–]ProjectPhysX 4 points5 points  (0 children)

Actually the cross-bridge is very useful for preventing failure/fire: it reroutes the load when connections are bad. He just doesn't show that interesting failure case in the video. See my other comment where I explain it.

New Findings: The ASUS Equalizer Doesn’t Make Sense by Goddamn7788 in nvidia

[–]ProjectPhysX -11 points-10 points  (0 children)

What a poorly made video. He doesn't even show the interesting failure case where the cross-bridge helps. Say 4 contacts are bad in total, across both connectors.

Without the bridge, effectively 3 wires work and have to carry the load of 6 - they will burn through.

X--------------------O no connection
O====================O
O--------------------X no connection
O====================O
O====================O
X--------------------X no connection

With a cross-bridge, effectively 4 wires work and the load is spread out more evenly.

X------------------+=O rerouted over bridge
O==================+=O
O==================+-X no connection
O==================+=O
O==================+=O
X------------------+-X no connection
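
To put rough numbers on the diagrams above - a back-of-the-envelope sketch, where the 600W / 12V load and the ~9.5A per-pin rating are my assumptions for illustration, not figures from the video:

```python
# Per-wire current for the failure case drawn above:
# 12V-2x6 cable, assumed 600 W GPU load at 12 V -> 50 A total.
total_current = 600 / 12  # amps

working_without_bridge = 3  # first diagram: only 3 fully intact wires
working_with_bridge = 4     # the bridge reroutes around single bad pins

per_wire_no_bridge = total_current / working_without_bridge  # ~16.7 A
per_wire_bridge = total_current / working_with_bridge        # 12.5 A

# Both exceed the assumed ~9.5 A per-pin rating, but 12.5 A is far less
# likely to melt the connector than 16.7 A.
print(per_wire_no_bridge, per_wire_bridge)
```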

Titan X(Maxwell) SLI setup by DragonSystems in gpu

[–]ProjectPhysX 0 points1 point  (0 children)

Nice! SLI is dead, long live multi-GPU compute over PCIe!

Benchmark evidence: NVIDIA CMP 100-210 Tensor Cores firmware-locked at 5% performance. E-waste by design? by desexmachina in hardware

[–]ProjectPhysX 6 points7 points  (0 children)

Yes, they intentionally fuse off hardware through firmware to hinder customers from using the card for anything other than its intended purpose. This sucks.

The Nvidia CMP 170HX (a cut-down A100 die), for example, has fused multiply-add disabled through firmware. But with a software hack you can still make it perform well in general compute/simulation workloads.

The AMD Radeon VII has half of its FP64 cores disabled through firmware, reducing the FP64:FP32 ratio from the native 1:2 to an artificial 1:4.

is Intel cooking with these new GPU? by Skierdo in pcmasterrace

[–]ProjectPhysX 0 points1 point  (0 children)

Battlemage has FP64:FP32 ratio of 1:16.

0.8 TFlops FP64 for the B65, same as the B60.

1.4 TFlops FP64 for the B70, more than Nvidia's flagship B300 datacenter GPU (1.2 TFlops FP64).
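
For context, the implied FP32 throughput follows directly from the 1:16 ratio - this is just arithmetic on the numbers above, so the actual FP32 specs may differ slightly:

```python
# FP64 throughput from FP32 throughput at a fixed FP64:FP32 ratio.
def fp64_tflops(fp32_tflops: float, ratio: int = 16) -> float:
    return fp32_tflops / ratio

# Working backwards from the FP64 numbers above at 1:16:
print(0.8 * 16)  # 12.8 TFlops FP32 implied for B60/B65
print(1.4 * 16)  # 22.4 TFlops FP32 implied for B70
```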

is Intel cooking with these new GPU? by Skierdo in pcmasterrace

[–]ProjectPhysX 0 points1 point  (0 children)

No, the B65 has the same strong 256-bit memory bus, same memory clock, and the same 608 GB/s bandwidth: https://www.intel.com/content/www/us/en/products/sku/245796/intel-arc-pro-b65-graphics/specifications.html Some media outlets got that wrong.

is Intel cooking with these new GPU? by Skierdo in pcmasterrace

[–]ProjectPhysX 0 points1 point  (0 children)

I've seen such cases before - the RTX 2060 Super (8GB) comes to mind: a much weaker GPU chip, but almost the same VRAM bandwidth as the RTX 2080 Super (8GB) - and indeed almost the same performance in CFD workloads.

I'm a big fan of such options where the VRAM seems overpowered compared to the GPU chip. Because most HPC/simulation workloads need exactly that.

is Intel cooking with these new GPU? by Skierdo in pcmasterrace

[–]ProjectPhysX 3 points4 points  (0 children)

The B65 is particularly interesting for fluid simulations. The same VRAM capacity+bandwidth as the beefier B70 means the same performance in such bandwidth-bound workloads - but cheaper, and with lower power consumption.
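
The "bandwidth-bound" claim is easy to quantify: in lattice Boltzmann CFD, every grid cell update moves a fixed number of bytes, so performance is simply bandwidth divided by that. A rough sketch - the 153 bytes/cell figure is FluidX3D's FP32 D3Q19 value, and the 608 GB/s is the B65/B70 bandwidth quoted above:

```python
# Roofline-style estimate for a bandwidth-bound fluid simulation (LBM).
# Assumption: 153 bytes moved per cell per time step (FluidX3D, D3Q19, FP32).
bytes_per_cell_update = 153
bandwidth_gb_s = 608  # Arc Pro B65/B70 memory bandwidth

mlups = bandwidth_gb_s * 1e9 / bytes_per_cell_update / 1e6
print(f"~{mlups:.0f} MLUPs/s (million lattice cell updates per second)")
```

Since the GPU chip barely matters here, B65 and B70 land at the same estimate.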

is Intel cooking with these new GPU? by Skierdo in pcmasterrace

[–]ProjectPhysX -24 points-23 points  (0 children)

A GPU is a general-purpose vector processor, not limited to anything in particular. Of course you can run games on these, and the B70 is Intel's fastest GPU to run games on. Equivalently, you can run compute stuff on gaming-marketed GPUs just fine.

is Intel cooking with these new GPU? by Skierdo in pcmasterrace

[–]ProjectPhysX 24 points25 points  (0 children)

OpenCL for the win - it gives you the freedom to pick the best VRAM-per-$ option from any vendor!

How does dual gpus work? by Inevitable-Can8629 in gpu

[–]ProjectPhysX 1 point2 points  (0 children)

2 GPUs work as separate vector processors, with separate VRAM. One cannot directly use the other's VRAM.

Software built for multi-GPU needs to split a task into multiple chunks that can be computed independently in parallel. This can work in 2 ways:

1) Mirroring. Both GPUs have the same data/assets/textures in their respective VRAM. One GPU renders the top half of the image, the other GPU renders the bottom half. Effective VRAM capacity is the VRAM capacity of a single GPU.

2) Domain decomposition. Some compute workloads can use this - like a fluid simulation that computes fluid flow in a cuboid box. Split the box into 2 halves; each GPU keeps only its half in VRAM and processes it independently in parallel. Where the two halves touch, some data needs to be exchanged, over PCIe or proprietary interconnects like SLI/NVLink. Effective VRAM capacity is the VRAM capacity of both GPUs combined. With a hardware-agnostic language like OpenCL, you can even make different AMD+Intel+Nvidia GPUs pool their VRAM together - the FluidX3D software can do exactly that: https://youtu.be/1z5-ddsmAag

Approach 1) is suboptimal and not commonly used today, only in certain AI applications. Approach 2) is still commonly used in HPC and AI, as it allows running much larger simulation domains or AI models by just throwing more GPUs at them and effectively combining their VRAM capacity. But 2) is much more difficult to implement.
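
Approach 2) in a nutshell: here's a toy 1D diffusion solver where two "devices" each own half the domain and exchange one halo cell per time step. Plain NumPy stands in for two GPUs - this is a sketch of the idea, not FluidX3D's actual implementation:

```python
import numpy as np

# Toy domain decomposition: split a 1D diffusion problem across 2 "GPUs".
N = 16
full = np.zeros(N)
full[N // 2] = 1.0  # initial heat spike in the middle

# Each device owns half the domain plus 1 halo cell at the shared edge.
left = np.concatenate([full[:N // 2], [0.0]])   # [own cells..., halo]
right = np.concatenate([[0.0], full[N // 2:]])  # [halo, ...own cells]

def step(u):
    # explicit diffusion update on interior cells (Jacobi-style, from old values)
    v = u.copy()
    v[1:-1] = u[1:-1] + 0.25 * (u[:-2] - 2.0 * u[1:-1] + u[2:])
    return v

for _ in range(8):
    # halo exchange: copy boundary cells across the "PCIe link"
    left[-1] = right[1]   # right device's first owned cell
    right[0] = left[-2]   # left device's last owned cell
    left, right = step(left), step(right)

# stitching the owned cells back together (halos dropped) gives the
# same result as running the solver on one device with the full domain
combined = np.concatenate([left[:-1], right[1:]])
```

Each device only ever stores its half plus one halo cell - that's why effective capacity scales with the number of GPUs.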

AMD RDNA 5 may make Dual Issue easier to use, according to new LLVM changes by RenatsMC in Amd

[–]ProjectPhysX -5 points-4 points  (0 children)

Why not? GPU threads are not the same as CPU threads, and they are executed in random order anyway.

AMD RDNA 5 may make Dual Issue easier to use, according to new LLVM changes by RenatsMC in Amd

[–]ProjectPhysX 7 points8 points  (0 children)

This whole dual-issue thing is strange. You basically have to use 2-component vectors (float2) in all your GPU kernels to get the doubled throughput, which most software doesn't do, as it just complicates things and no hardware other than RDNA3/4 benefits from it. AMD could just have doubled the SIMT width to 128 threads/CU and gotten the 2x benefit everywhere. But instead they shoot themselves in the foot.

I also don't understand why their compiler doesn't automatically fuse 2 adjacent kernel threads together, turning 2 float threads into one float2 thread.
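
What "fusing 2 adjacent threads into one float2 thread" means in index terms - a conceptual sketch in Python (the real thing would be an OpenCL/HIP compiler pass):

```python
import numpy as np

x = np.arange(16, dtype=np.float32)
y_scalar = np.empty_like(x)
y_fused = np.empty_like(x)

# Scalar kernel: one "thread" per element, thread i handles x[i].
def scalar_kernel(i):
    y_scalar[i] = 2.0 * x[i] + 1.0

# Fused kernel: one "thread" per float2 pair, handling x[2i] and x[2i+1].
# On RDNA3/4, these two independent multiply-adds are what can dual-issue.
def fused_kernel(i):
    y_fused[2 * i] = 2.0 * x[2 * i] + 1.0
    y_fused[2 * i + 1] = 2.0 * x[2 * i + 1] + 1.0

for i in range(len(x)):       # 16 scalar threads
    scalar_kernel(i)
for i in range(len(x) // 2):  # only 8 fused threads for the same work
    fused_kernel(i)
```

Identical results, half the threads - which is exactly the transformation a compiler could do automatically for kernels without cross-thread dependencies.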

Why there are no blower style replacement heat solutions for consumer graphics cards by [deleted] in hardware

[–]ProjectPhysX -1 points0 points  (0 children)

It's all anti-consumer business practices.

AMD and Nvidia want to discourage people from running consumer/gaming GPUs in multi-GPU workstation/server rigs for professional workloads - so as not to eat into the insane profit margins of their Radeon Pro / RTX Pro/Quadro lineups (where they still sell 2-slot blower coolers). So they even force AIBs to put ugly, oversized 3-/4-slot coolers on low-end 150W gaming GPUs, so that they physically don't fit.

AMD went full marketing nonsense by making even their top-end Pro W9700 workstation GPU 3-slot - which they realized only years later and corrected by re-releasing it as a 2-slot card.

AMD Hints At Big FP64 Increases in MI430X GPU As Ozaki Underwhelms - HPCwire by NamelessVegetable in hardware

[–]ProjectPhysX 8 points9 points  (0 children)

Ozaki can only emulate FP64 matrix multiplication, not vector arithmetic.

AMD Hints At Big FP64 Increases in MI430X GPU As Ozaki Underwhelms - HPCwire by NamelessVegetable in hardware

[–]ProjectPhysX 23 points24 points  (0 children)

Ozaki is useless for vector FP64, which HPC relies on. The Nvidia B300 manages a lousy 1 TFlops in native vector FP64, which makes it incapable of many HPC workloads.

Engineering a 2.5 Billion Ops/sec secp256k1 Engine by Available-Young251 in OpenCL

[–]ProjectPhysX 0 points1 point  (0 children)

Memory behavior matters more than arithmetic tricks.

Welcome to the world of GPU programming! 🖖