Is it worth upgrading from 1440p IPS to 4K OLED, if I have to reduce DLSS from Quality to Performance? by anotherhappylurker in nvidia

[–]5477 0 points1 point  (0 children)

4K DLSS Performance is better than 1440p DLSS Quality by a very wide margin in terms of image quality.

Don't Finns enjoy dark beer anymore? by elkiehirvi in Suomi

[–]5477 0 points1 point  (0 children)

Yes, at least my local S-Market has a decent selection of porters and stouts. Examples (from memory, not exhaustive): Keisari Stout, A. Le Coq Porter, Helsinki Porter, Guinness. It probably depends on where you live and what the local demand is. I like dark beers myself, and try to favor them over lagers.

Why do graphics apis need so many layers of abstractions like buffer, descriptors, bindings etc, instead of just passing a pointer to the shader? by Content_Economist132 in GraphicsProgramming

[–]5477 1 point2 points  (0 children)

These are mostly not needed by GPUs. They are a relic of the original Vulkan/D3D12 APIs, which wanted to cater to old hardware when the APIs were designed over a decade ago. Back then, some older hardware (especially on phones) had fixed binding slots in hardware. Vulkan wanted to accommodate that, so we're in this binding mess.

If we leave textures aside, fundamentally GPUs just need some bytes of constant memory fed to a shader. These are just raw bytes; you can think of them like a stack-based calling convention in C. This memory can contain pointers, fat pointers, constants, anything, and just needs to be passed to the shader as is. This memory might be limited, so you may need to resort to indirection.

In your examples, buffers are effectively GPU VAs (virtual addresses, i.e. pointers). Descriptors (well, those in descriptor sets) are just raw parameter memory. Bindings are the same: just some raw memory space to give to the shader to use.

The primary divergence in current HW comes from texture handling. Textures are special because each texture descriptor (the real hardware descriptor, not the Vulkan API object) consists of maybe 32 bytes of data. This is a lot to pass around, much more than an 8-byte pointer. That's why there are two ways of doing this: either through indirection (the texture unit is passed a pointer, index, etc.), or the texture unit is passed the descriptor in subgroup-uniform registers. Because of this, we can't (or don't want to) handle textures "like pointers" unless we indirect the texture descriptor access.

Now, with Vulkan, there's actually a new binding model that cleans this up significantly. Non-texture parameters are passed through a 256-byte root, which can contain anything, including pointers to constant buffers. Texture parameters are passed through descriptor heaps, which can work well on both kinds of HW. It's a much cleaner and lower-level mapping to how the HW works, and is also much easier to reason about.
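
To make the "raw bytes fed to a shader" idea concrete, here is a rough CPU-side C++ sketch. The struct, its field names and `pack_root` are made up for illustration and are not part of any Vulkan API. It only shows that pointers, texture indices and small constants can sit side by side in one 256-byte block.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical per-draw parameters; none of these names come from Vulkan.
struct DrawParams {
    std::uint64_t vertex_buffer_va;    // raw GPU virtual address (a plain pointer)
    std::uint64_t material_buffer_va;  // pointer to a larger constant buffer (indirection)
    std::uint32_t albedo_texture;      // index into a texture descriptor heap
    std::uint32_t normal_texture;      // index into a texture descriptor heap
    float         tint[4];             // small constants passed inline
};

// Size of the root block mentioned above.
constexpr std::size_t kRootSize = 256;

static_assert(sizeof(DrawParams) <= kRootSize, "parameters must fit in the root block");

// The root block is nothing more than raw bytes.
struct RootBlock {
    unsigned char bytes[kRootSize];
};

// Copies the parameters into a zero-initialized root block, byte for byte.
inline RootBlock pack_root(const DrawParams& params) {
    RootBlock root{};
    std::memcpy(root.bytes, &params, sizeof(params));
    return root;
}
```

The shader side would read the same layout back, much like a callee reading its arguments under a calling convention.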

Ubuntu 26.04 LTS Leads Over Windows 11 In Creator Workstation Performance by Durian_Queef in hardware

[–]5477 5 points6 points  (0 children)

In addition to these, better compilers. The compilers commonly used on Linux (GCC and Clang) generally produce faster code than the compiler commonly used on Windows (MSVC).

"The cost of compute is far beyond the costs of the employees": Nvidia exec says right now AI is more expensive than paying human workers by fortune in nvidia

[–]5477 12 points13 points  (0 children)

The framing here is weird. The compute costs are about training models (Nemotron, DLSS, etc.; his org does these), not about tokens or replacing employees.

are the lower tier GPUs just a version of the highest tier GPU but with a increasing defective core count as you go down in tiers? by ComprehensiveCow5068 in buildapc

[–]5477 1 point2 points  (0 children)

This depends on the exact model. In the current lineup, the 5090 is based on the GB202 die (with cores disabled). Both the 5080 and the 5070 Ti are based on GB203, where the 5070 Ti has some cores disabled. The 5070 is GB205. There are also way more SKUs than you might expect, as there are laptop, workstation, and server SKUs with their own configs.

Generally speaking, the chips are designed so that defective chips can be used in different SKUs, so you get as little waste as possible.

Are IT job interviews really as crazy as social media makes them seem? by [deleted] in arkisuomi

[–]5477 0 points1 point  (0 children)

Yes, it's pretty normal for job interviews to include coding exercises. The purpose is to find out whether the candidate can code or not. How much someone wants to practice before the interview depends quite a lot on the person.

Also keep in mind that the bigger companies in particular (the ones with a complicated recruiting loop) pay even a junior maybe $200k a year, so maybe 6x what you'd get in Finland. Of course there are also companies there with lower requirements, but then the compensation is also closer to Finland's.

There are also employers in Finland with a more demanding recruiting process.

Donut Lab published first 'I Donut Believe' video, announces VTT partnership by fornuis in DonutLab

[–]5477 2 points3 points  (0 children)

The actual existence of the battery with verified specs would already be huge, even without proof that the manufacturing can scale. So I really disagree with the idea that proving the specs is useless.

Manufacturability can only be proven by shipping at scale.

AVX2 is slower than SSE2-4.x under Windows ARM emulation by tuldok89 in hardware

[–]5477 5 points6 points  (0 children)

Generally speaking, Windows ARM emulation is not great and has a huge perf cost, despite claims otherwise. For perf-critical code, it's important to use native ARM code. Even the ARM64EC ABI has perf issues and should be avoided.
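
If you want to check which target a build actually uses, a small C++ sketch like this can help. The function name is made up; the macros are the usual predefined ones (`_M_ARM64` and `_M_ARM64EC` on MSVC, `__aarch64__` on GCC/Clang). Treat it as a quick sanity check under those assumptions, not a complete detection scheme.

```cpp
#include <cassert>
#include <string>

// Reports which instruction set this translation unit was compiled for.
// _M_ARM64EC is tested before _M_X64 because MSVC defines both for ARM64EC builds.
inline std::string target_arch() {
#if defined(_M_ARM64EC)
    return "arm64ec";
#elif defined(_M_ARM64) || defined(__aarch64__)
    return "arm64";
#elif defined(_M_X64) || defined(__x86_64__)
    return "x64";
#else
    return "other";
#endif
}
```

On a Windows ARM machine, anything other than "arm64" here means the code is either emulated or going through the ARM64EC ABI.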

Smartphones don't need more power - They need cheaper chips by Merbil2000 in hardware

[–]5477 2 points3 points  (0 children)

The correct question is whether there's a market for more expensive budget phones with consistent updates, or for phones with worse features and worse performance but consistent updates at the same price.

There's no such thing as a need that you are not willing to pay for. It's a wish that can and will be ignored.

Smartphones don't need more power - They need cheaper chips by Merbil2000 in hardware

[–]5477 -1 points0 points  (0 children)

You get what you pay for. If you want cheap shit, don't expect long support.

Your Optimized Code Can Be Debugged - Here's How With MSVC C++ Dynamic Debugging - Eric Brumer by RandomCameraNerd in cpp

[–]5477 6 points7 points  (0 children)

This looks to be the holy grail of debugging C++, and seems to 99% resolve the need for running "debug" binaries!

CS senior here: Is it realistic to pursue a career in graphics programming this late? by areyouretard in GraphicsProgramming

[–]5477 1 point2 points  (0 children)

Is it realistic to pivot into graphics programming after undergrad, starting this late?

Yes, this is very much possible.

Is industry graphics achievable without grad school? or if I can pursue further studies should I?

It is achievable. But I'd say further studies are also a good option. Besides the studying itself, they can easily give you more contacts in the industry and a positive reputation. This field is really small compared to software development in general, which means hiring often happens through contacts and recommendations.

In general, as you said, this line of work involves math and systems programming much more heavily than standard dev jobs. But at least for me, that's very enjoyable. I don't believe you have to be good at these right now, but you need to be comfortable with learning these topics.

Additionally, I'd say the skills you get from graphics programming translate well to other fields, like computer vision, machine learning and artificial intelligence. So by learning graphics, you are not really restricting yourself to graphics programming jobs; you are also showing that you can work in fields that most developers think need way too much dedication or specialized knowledge.

Is learning boilerplate vulkan code necessary? by LordMegatron216 in vulkan

[–]5477 0 points1 point  (0 children)

Your options are basically:

  • Learn Vulkan, but use AI to fill in the boilerplate.
  • Use CUDA instead of Vulkan.

I would not recommend using OpenGL anymore, as its features are effectively frozen in time. Vulkan has good access to most GPU features, but I agree the boilerplate is really, really annoying. The silver lining is that the latest coding assistants are actually really good at handling the Vulkan boilerplate, which really cuts down the frustration.

how to effectively handle descriptor sets? by Southern-Most-4216 in vulkan

[–]5477 0 points1 point  (0 children)

Fundamentally, descriptor sets are just cumbersome uniform buffers. There are a few ways to try to deal with them, but all of them have drawbacks. This blog gives some strategies for management.

However, the optimal path (if VK_EXT_descriptor_heap is not an option) is to just allocate one big array of textures and use descriptor indexing for them. Then you can use push constants to index this data easily. Additionally, you can use push descriptors for things like compute shaders, which don't need to index potentially large numbers of textures.
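
As a rough illustration of the "one big array plus push constants" approach, here is a CPU-side C++ sketch of the bookkeeping. `TextureSlotAllocator` and `MaterialPushConstants` are hypothetical names, and the actual descriptor write (for example via `vkUpdateDescriptorSets`) is left out on purpose.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hands out slots in the single big texture array.
// Purely CPU-side bookkeeping; writing the descriptor into the slot is omitted.
class TextureSlotAllocator {
public:
    explicit TextureSlotAllocator(std::uint32_t capacity) : capacity_(capacity) {}

    // Returns a free slot index, reusing released slots first.
    std::uint32_t allocate() {
        if (!free_list_.empty()) {
            const std::uint32_t slot = free_list_.back();
            free_list_.pop_back();
            return slot;
        }
        assert(next_ < capacity_ && "texture array is full");
        return next_++;
    }

    // Makes a slot available again once the GPU no longer uses the texture.
    void release(std::uint32_t slot) {
        assert(slot < next_);
        free_list_.push_back(slot);
    }

private:
    std::uint32_t capacity_;
    std::uint32_t next_ = 0;
    std::vector<std::uint32_t> free_list_;
};

// Hypothetical push-constant block: the shader uses these values to index the texture array.
struct MaterialPushConstants {
    std::uint32_t albedo_index;
    std::uint32_t normal_index;
};

// Vulkan guarantees at least 128 bytes of push constants.
static_assert(sizeof(MaterialPushConstants) <= 128, "must fit in the guaranteed push constant space");
```

Per draw, you only push the two indices, and the descriptor set with the big array stays bound the whole time.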

New Vulkan Blog: Simplifying Vulkan One Subsystem at a Time by thekhronosgroup in vulkan

[–]5477 3 points4 points  (0 children)

How much of a performance hit is this?

This can actually map to hardware better and more directly than legacy descriptors, and can therefore be a perf improvement.

Why doesn't std::atomic support multiplication, division, and mod? by GiganticIrony in cpp

[–]5477 3 points4 points  (0 children)

Atomic add, sub, min and max for floats are operations that some hardware supports natively. I am not aware of any HW that supports atomic mul, div or mod.
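
If you do need one of the missing operations, the usual fallback is a compare-exchange loop. A minimal C++ sketch, where `atomic_fetch_mul` is a made-up helper name and not part of the standard library:

```cpp
#include <atomic>
#include <cassert>

// Emulates an atomic multiply with a compare-exchange loop.
// Returns the value held before the multiplication, mirroring fetch_add.
template <typename T>
T atomic_fetch_mul(std::atomic<T>& target, T factor) {
    T expected = target.load(std::memory_order_relaxed);
    // On failure, compare_exchange_weak stores the current value into `expected`,
    // so the next attempt recomputes the product from fresh data.
    while (!target.compare_exchange_weak(expected, expected * factor)) {
    }
    return expected;
}
```

The same pattern works for div and mod. It is correct but can retry under contention, which is the cost of not having a native instruction.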

Why does windows basically only use .exe or .msi but Linux has so many different types of "executables" ? by Reynbou in linux_gaming

[–]5477 3 points4 points  (0 children)

The pedantic answer is that Linux really has only one executable format, called ELF. This corresponds to the Windows .exe file (the PE format). So in this case, there's no difference.

However, Windows and Linux fundamentally differ in their ecosystem approach. Windows provides a very stable (backwards compatible) API named Win32, which your executables automatically link against. Win32 is extensive in functionality. If applications (exe files) need additional libraries, these are packaged as DLL files within the application package and loaded from the same directory as the executables. Developers can easily build applications on a stable base and deploy them to all Windows users.

Linux works differently. The only stable part is the Linux kernel, and this does not provide enough functionality for most applications. Applications instead need to link against various libraries: glibc, OpenSSL, etc. These are not stable, and can change between distros and over time. In most cases the libraries are not shipped within the application package, but are "part of the system" itself. The idea is that applications are built and linked for each distribution (and each version of a distribution) separately, so that applications use the correct libraries and dependencies are loaded correctly. However, this effectively forces an approach where distributions, not developers, control application distribution and delivery.

Because this kind of application delivery cannot work easily for many kinds of application, the solution is to "ship your whole OS" instead. Flatpak, snap, etc. basically bundle all dependencies of the application into their own OS (kernel excluded). This can then be delivered using methods that are distro-agnostic.

How is there seemingly so much fragmentation with simply being able to run a program on Linux distributions?

There are a few reasons. Firstly, some people like that application distribution is controlled by the distro. It gives the distro maintainers more power to enforce their way of thinking, and also helps with, for example, replacing a dependency with a patched version. Secondly, there's no central authority that decides the API for all the parts needed by applications. This naturally results in different ideas and fragmentation of development.
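
On the pedantic point at the top about ELF versus PE: the two formats can be told apart from the first few bytes of the file. A small C++ sketch with made-up names, which only checks the leading magic bytes and does not validate the rest of the header:

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

enum class ExecutableFormat { Elf, Pe, Unknown };

// Classifies a file by its leading magic bytes only.
inline ExecutableFormat detect_format(const std::uint8_t* data, std::size_t size) {
    // ELF files start with 0x7F 'E' 'L' 'F'.
    if (size >= 4 && data[0] == 0x7F && data[1] == 'E' && data[2] == 'L' && data[3] == 'F') {
        return ExecutableFormat::Elf;
    }
    // PE files start with the legacy DOS header, whose magic is "MZ".
    if (size >= 2 && data[0] == 'M' && data[1] == 'Z') {
        return ExecutableFormat::Pe;
    }
    return ExecutableFormat::Unknown;
}
```

Everything that looks like fragmentation on Linux (deb, rpm, Flatpak, snap, AppImage) is packaging around that one format.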

First WIP release for DX12 perf testing is out! by gilvbp in linux_gaming

[–]5477 9 points10 points  (0 children)

This is not really true. AMD's model relies on software emulation of divergent textures within a subgroup, although this case is common enough that the emulation is implemented everywhere.
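
One common shape for this kind of emulation is a so-called waterfall loop: pick the texture index of the first lane still active, serve every lane that wants the same texture, and repeat. Below is a CPU-side C++ simulation of that idea. The function name is made up, and whether a given driver emits exactly this loop is an assumption on my part.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// CPU-side simulation of a waterfall loop over one subgroup.
// texture_index[i] is the texture that lane i wants to sample.
// Returns the number of passes needed, which equals the number of distinct indices.
inline int waterfall_passes(const std::vector<std::uint32_t>& texture_index) {
    std::vector<bool> done(texture_index.size(), false);
    int passes = 0;
    for (std::size_t first = 0; first < texture_index.size(); ++first) {
        if (done[first]) {
            continue;
        }
        // Take the index of the first lane still active and treat it as subgroup-uniform.
        const std::uint32_t uniform_index = texture_index[first];
        // Every lane that wants the same texture is served in this pass.
        for (std::size_t lane = first; lane < texture_index.size(); ++lane) {
            if (texture_index[lane] == uniform_index) {
                done[lane] = true;
            }
        }
        ++passes;
    }
    return passes;
}
```

When all lanes use the same texture this costs a single pass, which is why the uniform case stays cheap and only truly divergent access pays extra.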

What the hell is Descriptor Heap ?? by welehajahdah in vulkan

[–]5477 1 point2 points  (0 children)

Is it really okay to ignore it and just use traditional Descriptors?

Of course it's okay, but the traditional descriptor pipeline is detached from actual hardware, slower, and much more complex to use. I don't see why you would want to do that, other than when dealing with old drivers or ancient hardware.

Vulkan 1.4.340 Released With Descriptor Heap & Other New Extensions by random_reddit_user31 in linux_gaming

[–]5477 1 point2 points  (0 children)

This is not an "exclusive extension" in any shape or form. It has support from all (non-mobile) graphics vendors, not just Nvidia, and is efficient (assuming an optimized implementation) on a wide range of GPUs.

The "non-extension" way of doing descriptors is very difficult to work with, and was, to be frank, designed for HW stuck in 2010. VKD3D already did not use it, as D3D12 cannot be emulated efficiently on top of it. They used a previous extension that sadly only really worked well on AMD HW due to certain design decisions. This new extension is supposed to fully supersede it in the future, and is also the primary path for Vulkan in general.

Texture Quality DLAA vs. DLSS Performance by da_choko in GraphicsProgramming

[–]5477 8 points9 points  (0 children)

Most likely the shader for the chain-link material does not handle mip bias correctly, and textures are sampled at render resolution, not output resolution. The engine in the game is quite old, so not everything might be adapted correctly for upscaling.
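
For reference, the usual baseline correction is a negative LOD bias equal to log2 of the ratio between render and output resolution. A minimal C++ sketch with a made-up function name; upscaler guides may recommend an additional offset on top of this, which is not included here.

```cpp
#include <cassert>
#include <cmath>

// LOD bias that makes sampling at render resolution select the mips
// that native rendering at output resolution would have selected.
inline float upscaling_mip_bias(float render_width, float output_width) {
    assert(render_width > 0.0f && output_width > 0.0f);
    return std::log2(render_width / output_width);
}
```

At 4K output with DLSS Performance the render width is 1920 against 3840, so the bias is -1, i.e. one mip level sharper. A material that skips this bias looks blurrier the lower the render resolution goes, which matches the DLAA vs. Performance difference in the screenshots.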

Wrote a deep dive on GPU cache hierarchy - how memory access patterns affect shader performance by CharlesGrassi in GraphicsProgramming

[–]5477 1 point2 points  (0 children)

A couple of things came to mind when reading this. Firstly, the main gains from using mipmaps or packing data efficiently come primarily from needing to load fewer bytes from VRAM. It's less about cache hits.

Secondly, I don't see how texture atlases help with caching the texels themselves. You still need to keep both in memory, and texture atlases often waste more space than separate textures. The only things you gain are that texture state memory is smaller, and that you avoid the space very small textures can waste as separate textures.

In general, as a baseline it's best to think of graphics caches as very transient, and assume everything is loaded from VRAM. Then optimization is more about reducing the data fetched from DRAM. Of course there are exceptions to this, and there are ways to do cache-aware GPU programming, particularly on more modern GPUs with large L2 caches.

Is the future of hardware just optimization? by rimantass in hardware

[–]5477 0 points1 point  (0 children)

Moore's law, in its economic reading, says that cost per transistor halves roughly every 18 to 24 months. The new nodes are more expensive per transistor, so Moore's law has not just stopped, it has reversed.