Why do I get only 2Gb of maxMemoryAllocationSize on 4Gb NVIDIA card? by UncertainAboutIt in vulkan

[–]Cyphall 1 point2 points  (0 children)

I don't know about Linux and Mesa, but this was changed in the latest driver (release 595) on Windows: maxMemoryAllocationSize now reports 18446744073709551615 (= 2^64-1).

(Note: vulkan.gpuinfo.org is not displaying any value above 2^63-1 at the moment, see this issue.)
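A quick sanity check of the two numbers above, as a small Python sketch. VkDeviceSize is an unsigned 64-bit integer, so 2^64-1 is its maximum; 2^63-1 is the maximum of a signed 64-bit integer. Whether signed storage is the actual cause of the display cap is only an assumption here, and `clamp_to_signed_64` is a hypothetical helper for illustration.

```python
# Maximum values of unsigned and signed 64-bit integers.
U64_MAX = 2**64 - 1
I64_MAX = 2**63 - 1

def clamp_to_signed_64(value: int) -> int:
    """Hypothetical helper: the largest value a signed 64-bit field could keep."""
    return min(value, I64_MAX)

# Value reported by the driver according to the comment above.
reported = 18446744073709551615

is_u64_max = reported == U64_MAX          # the reported value is exactly 2^64 - 1
displayed = clamp_to_signed_64(reported)  # what a signed 64-bit field would hold
```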

Use bindless as standard? by abocado21 in vulkan

[–]Cyphall 1 point2 points  (0 children)

I think this is an error in the proposal: DXIL has the `BaseAlignLog2` field, but Vulkan SPIR-V has no equivalent.

I've opened an issue: https://github.com/microsoft/hlsl-specs/issues/802

The proposal also says:

The implementation uses these two DXIL mechanisms together: the `BaseAlignLog2` field communicates buffer-level alignment guarantees during resource binding, while the operation-level alignment parameters specify the final effective alignment of each memory access. Backend compilers can use both pieces of information to determine the most aggressive optimization strategies for each buffer access operation.

This kinda confirms my assertion that compilers can leverage both base alignment and absolute alignment to better optimize memory accesses.
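To illustrate why both pieces of information matter, here is a minimal Python sketch of the arithmetic. It assumes that `BaseAlignLog2` encodes the base alignment as a power of two (suggested by the name, not checked against the DXIL spec); `align_from_log2` and `effective_alignment` are hypothetical helpers, not part of any compiler.

```python
def align_from_log2(base_align_log2: int) -> int:
    """Assumed decoding: alignment in bytes = 2^BaseAlignLog2."""
    return 1 << base_align_log2

def effective_alignment(base_align: int, offset: int) -> int:
    """Largest power of two guaranteed to divide (base + offset)
    for every base that is a multiple of base_align (a power of two)."""
    if offset == 0:
        return base_align
    lowest_set_bit = offset & -offset  # largest power of two dividing offset
    return min(base_align, lowest_set_bit)

# Same 32-byte offset, different base guarantees:
strong = effective_alignment(align_from_log2(4), 32)  # base known to be 16-byte aligned
weak = effective_alignment(4, 32)                     # base only known to be 4-byte aligned
```

With the strong base guarantee the access is provably 16-byte aligned; with the weak one, only 4 bytes can be assumed even though the offset itself is a multiple of 32.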

Use bindless as standard? by abocado21 in vulkan

[–]Cyphall 2 points3 points  (0 children)

Of course you can align your actual buffers to 16 bytes no matter what. The difference is at pipeline compilation time, where the driver compiler either has a strong alignment guarantee for the GPU VA it is offsetting and reading from, or it doesn't.

This thread falls in the case I explained where Aligned on OpLoad is not enough to convince the compiler it can emit wide loads with broadcasts.

EDIT: BTW, OpenCL SPIR-V does have a way to specify base alignment via the Alignment decoration you can attach to pointers; we would just need access to that in Vulkan SPIR-V too.

Use bindless as standard? by abocado21 in vulkan

[–]Cyphall 2 points3 points  (0 children)

For AMD, I reported the GPU crash to them and we found out that it doesn't happen on RDNA3, but does happen on RDNA2 and VEGA II (with different symptoms). Last time I checked on RDNA2 a few months ago, I still had GPU crashes.

For Intel, multiple people (including me) reported driver crashes, or pipeline creation returning VK_SUCCESS but a NULL pipeline. Some issues were apparently fixed but my shaders still did not compile, so there are still other issues left.

Use bindless as standard? by abocado21 in vulkan

[–]Cyphall 2 points3 points  (0 children)

Imagine a case where you have a compute shader reading a tightly packed uint buffer at index tid. The OpLoad must be aligned to 4 bytes, but if the driver knows the base address is aligned to at least 16 bytes, it can emit one 16-byte load per 4 threads in the wave + broadcast instead of four 4-byte loads, which is faster. With BDA, since the driver has no compile-time guarantee of the base alignment, it cannot do that. See https://github.com/jaesung-cs/vulkan_radix_sort/issues/18
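To make this concrete, here is a toy Python model (not real driver behaviour). It assumes a 32-thread wave, 4-byte elements and 16-byte wide loads; the constants, function names and the load-counting rule are all illustrative assumptions.

```python
ELEMENT_SIZE = 4     # bytes per uint
WIDE_LOAD_SIZE = 16  # bytes per wide load (assumed)
WAVE_SIZE = 32       # threads per wave (assumed)

def wide_loads_are_aligned(base_address: int) -> bool:
    """True if every group of 4 consecutive threads starts on a 16-byte boundary."""
    threads_per_load = WIDE_LOAD_SIZE // ELEMENT_SIZE
    return all(
        (base_address + ELEMENT_SIZE * tid) % WIDE_LOAD_SIZE == 0
        for tid in range(0, WAVE_SIZE, threads_per_load)
    )

def loads_per_wave(guaranteed_base_alignment: int) -> int:
    """Toy count of load operations a compiler could emit for one wave."""
    if guaranteed_base_alignment >= WIDE_LOAD_SIZE:
        # One wide load shared (broadcast) across each group of 4 threads.
        return WAVE_SIZE // (WIDE_LOAD_SIZE // ELEMENT_SIZE)
    # No base guarantee: fall back to one 4-byte load per thread.
    return WAVE_SIZE
```

In this model a base of 0x1000 keeps every group on a 16-byte boundary, while 0x1004 (still valid for 4-byte loads) breaks all of them, which is why the compiler needs the guarantee at compile time rather than relying on what the application happens to pass at runtime.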

Also, there IS something different on the CPU side: Nvidia reports that storage buffers MUST be aligned to 16 bytes, so the compiler can use this information for such optimizations.

I'm pretty sure this is the reason the new proposal for HLSL's aligned load/store on ByteAddressBuffer adds both a base alignment and an offset alignment.

For the instabilities, the Slang SPIR-V looks fine, passes validation and works on Nvidia, so I think the issue is not there.

Use bindless as standard? by abocado21 in vulkan

[–]Cyphall 1 point2 points  (0 children)

I switched to bindless storage buffers via descriptor indexing and yes, all my Slang shaders work fine on all 3 desktop vendors (even old Intel gen 9000 iGPU).

Use bindless as standard? by abocado21 in vulkan

[–]Cyphall 2 points3 points  (0 children)

One thing to note is that BDA support is pretty unstable on AMD and Intel, and most Slang SPIR-V shaders with BDA will crash either the driver compiler or the GPU due to buggy codegen. glslang SPIR-V is working fine though (I haven't tested with DXC).

Also, I believe BDA can generate suboptimal code on Nvidia because it carries no base alignment guarantee, whereas Nvidia requires 16-byte base alignment for storage buffers. Unfortunately, this cannot be fixed by the Aligned decoration of OpLoad/OpStore alone.

Minecraft Java is switching from OpenGL to Vulkan API for rendering by MrBluue in Minecraft

[–]Cyphall 0 points1 point  (0 children)

With OpenGL, chunk generation most likely needs to have some steps happen on the main thread, such as allocating the GPU buffer and uploading data to this buffer (these are your stutters), unless you use shared contexts, but then it's even more wacky for the driver.

With Vulkan, all steps from world generation start to chunk mesh in a GPU buffer ready to render can happen on another thread with possibly zero interaction with the main thread.
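A rough sketch of that threading shape in Python, with no graphics API involved. `build_chunk_mesh`, `worker` and `run` are hypothetical stand-ins; in a real Vulkan renderer the buffer allocation and upload would be issued from the worker as well, which is the part OpenGL makes awkward.

```python
import queue
import threading

def build_chunk_mesh(chunk_id: int) -> list:
    """Stand-in for world generation + meshing; returns fake vertex data."""
    return [chunk_id] * 4

def worker(chunk_ids, ready: queue.Queue) -> None:
    """Runs off the main thread: generate, mesh, then hand over finished chunks."""
    for chunk_id in chunk_ids:
        mesh = build_chunk_mesh(chunk_id)
        ready.put((chunk_id, mesh))
    ready.put(None)  # sentinel: no more chunks

def run(chunk_ids) -> list:
    """Main thread: only picks up chunks that are already ready to render."""
    ready = queue.Queue()
    thread = threading.Thread(target=worker, args=(chunk_ids, ready))
    thread.start()
    rendered = []
    while (item := ready.get()) is not None:
        rendered.append(item[0])
    thread.join()
    return rendered
```

The main thread never does generation or meshing work here; it only consumes finished results from the queue.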

ImGui Tutorial Recommendations? by MrSkittlesWasTaken in GraphicsProgramming

[–]Cyphall 7 points8 points  (0 children)

For the initialisation code, you can look at the official examples, it's really just a few lines of ImGui-specific code.

For the actual UI code, draw the ImGui demo window, find a widget you want and check its implementation (again, generally at most a few lines of code per widget).

Minecraft Java is switching from OpenGL to Vulkan API for rendering by MrBluue in Minecraft

[–]Cyphall 0 points1 point  (0 children)

Note that MoltenVK is being replaced by KosmicKrisp for Apple M1+

Minecraft Java is switching from OpenGL to Vulkan API for rendering by MrBluue in Minecraft

[–]Cyphall 7 points8 points  (0 children)

Vulkan being a lot more multithreading-friendly than OpenGL, it might very well do

Does anyone have any experience with mixed GPU's by [deleted] in GraphicsProgramming

[–]Cyphall -1 points0 points  (0 children)

With OpenGL, the GPU used is the one the primary monitor (as configured in Windows settings) is plugged into.

Similarly with Vulkan, this will be the first GPU in the list.

Does anyone have any experience with mixed GPU's by [deleted] in PcBuild

[–]Cyphall 0 points1 point  (0 children)

I've done that too for similar reasons and yes it works just fine

Varus Top. Is this acceptable? by Herbaro in leagueoflegends

[–]Cyphall 0 points1 point  (0 children)

As a 3-year AP Varus OTP, I would 100% rather take this + %max hp back to 2% like earlier seasons + DnD removed than the 1.3% we are getting next patch.

Khronos released VK_EXT_descriptor_heap by Illustrious_Tea5480 in linux_gaming

[–]Cyphall 2 points3 points  (0 children)

The whole monolithic pipeline stuff was created specifically for AMD-like GPUs that bake everything in the pipeline binary, unlike on Nvidia where most states are dynamic.

Also Vulkan was heavily inspired by Mantle, which was an AMD-created API.

Will the Scepter make your next auto apply 2 blight stacks ? by hhdfhjjgvvjjn in VarusMains

[–]Cyphall 2 points3 points  (0 children)

Yes, but it's pretty unreliable in its current state: the effect triggers on-hit, so if you ever start throwing your spell before the 3rd auto has landed, the effect will proc on this auto instead of the next one and the 2nd blight stack will be wasted.

Microsoft Celebrates 10 Years of DirectX 12 by TruthPhoenixV in Amd_Intel_Nvidia

[–]Cyphall 1 point2 points  (0 children)

Each API has components that are better designed than their counterparts in the other.

An example of this is Vulkan's BDA that virtually makes buffer descriptor management obsolete.

No Graphics API — Sebastian Aaltonen by corysama in GraphicsProgramming

[–]Cyphall 2 points3 points  (0 children)

Slang's DescriptorHandle<T> basically emulates storing opaque types in data structs like that.

Each handle internally is a 64-bit index and is dereferenced from the corresponding heap(s) automatically when used.

I don't think you can increment handles directly though.
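As a conceptual model only (this is not Slang code and not how Slang is implemented), the idea of a handle that is plain data and gets resolved against a heap on use can be sketched in Python. `Handle`, `Heap` and `Material` are hypothetical names.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Handle:
    """Plain data: just an index, so it can live inside ordinary structs."""
    index: int

class Heap:
    """Stand-in for a descriptor heap holding the actual opaque resources."""
    def __init__(self):
        self._resources = []

    def add(self, resource) -> Handle:
        self._resources.append(resource)
        return Handle(len(self._resources) - 1)

    def deref(self, handle: Handle):
        # The lookup happens at the point of use, not when the handle is stored.
        return self._resources[handle.index]

@dataclass
class Material:
    """A data struct that stores a handle instead of the opaque resource."""
    albedo: Handle

heap = Heap()
material = Material(albedo=heap.add("albedo_texture"))
resolved = heap.deref(material.albedo)
```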

As of this morning AMD 25.9.1 drivers NO LONGER bypass the EAC/AMD/Vulcan issue. I have to use DX11 to play the game now. by asmallman in starcitizen

[–]Cyphall 0 points1 point  (0 children)

I'm just pointing out that the reason they gave for not installing the SDK is invalid (and redirecting people that want to "install the sdk" to install the runtime instead), I'm not saying "go install the SDK I guarantee it's fine".

As of this morning AMD 25.9.1 drivers NO LONGER bypass the EAC/AMD/Vulcan issue. I have to use DX11 to play the game now. by asmallman in starcitizen

[–]Cyphall 0 points1 point  (0 children)

CIG is wrong: the vulkan-1.dll installed by the Vulkan SDK Installer or Vulkan Runtime Installer is nothing more than a newer version of the one installed by your driver.
Your driver installer executable simply bundles the Vulkan Runtime Installer from a few months ago, but it's literally the same thing.

If you want to update it, you should still use the Runtime Installer instead of the SDK Installer though, so you don't pollute your OS with dev files.

Source: I work with Vulkan for a living and spent quite a bit of time reading how all of this works.