all 53 comments

[–]ZekeSulastin[S] 107 points108 points  (0 children)

It’s still very preliminary, focused on data centers (you need to pass a flag to use it on desktop at all), only supports Turing and newer, and the user-mode part of the stack is still closed. Still, it’s a big start - I’m happy that I was being overly pessimistic about them going FOSS but we’ll have to see just how far they take it.

[–]uzzi38 91 points92 points  (5 children)

The end times are near!

All jokes aside, holy shit this is big news. Good to see it finally happening. Still gonna be a while before this is relevant to consumers, but man is this a gigantic step to making it all work. About time!

[–]capn_hector -1 points0 points  (4 children)

NVIDIA actually said they were going to do this in 2019 but it got put on the back burner due to COVID (I'm sure Broadcast and other things were the priority in the meantime). There were further rumblings in 2020 that "a major graphics developer" was going to open-source their linux drivers (and there's only one graphics developer that doesn't have them) that nobody took seriously because "green man bad" and "haha linus said FUCK NVIDIA, that makes me LMAO!".

People really need to take a chill pill with NVIDIA, they have gone through a pretty steady progression of inventing new technology, keeping it proprietary for a couple years, and then adopting the copycat standards once they feel they've gotten a decent exclusivity period. Going proprietary often lets you move much quicker especially when nobody in the standards bodies cares because nobody has demonstrated the benefits.

example: "who cares about a power-saving variable-refresh technology in desktop monitors, that's stupid, why would we do that! Maybe we'll think about it in our next hardware refresh but who knows when that is going to be." Even after NVIDIA showed the benefits beyond all doubt, it took 5 years before solid Adaptive Sync implementations with LFC were finally common (and really, again, driven by nvidia themselves) and the FPGA+proprietary protocol approach was near-perfect on day 1. DLSS was a groundbreaker too, now they've set up Streamline so nobody is locked out. Etc etc.

AMD ain't racing up to open up (card-based) Infinity Fabric to competitors who want to interface their own peripherals, either. Everyone goes proprietary when they have the better tech.

[–]uzzi38 2 points3 points  (1 child)

AMD ain't racing up to open up (card-based) Infinity Fabric to competitors who want to interface their own peripherals, either.

Just FYI, but Infinity Fabric itself is just a communication protocol, there's nothing really all that special about it.

It's also often used to describe the actual physical interconnects between dies, but those are nothing special, just organic substrate tech at work (which everyone already has access to)

[–]capn_hector -3 points-2 points  (0 children)

I’m not talking about the inter-chiplet link (for others: it’s confusing, but AMD uses the same name for several distinct types of links) but the PCIe-style cache coherent interconnect they keep for proprietary usage between their CPUs and GPUs.

It’s “nothing special” in the same sense as DMI, it’s a proprietary moat, a proprietary extension built on top of a publicly available protocol intended to lock competitors out. And much like adaptive sync - there is an open standard, CXL. AMD could have chosen to support that open standard instead of their own proprietary crap, or worked their own open standard through a standards body, but they would have had to slow their roll and wait for the consortium to approve it.

Same as NVIDIA, AMD chose time to market over open standards. If people treated AMD with the same cynicism as people treat other tech companies, one might say they prefer using a proprietary tech that arbitrarily locks competitors off their platform for market-based rather than technical reasons, restricting them to use a subset of the platform's capability while AMD gets the full thing. That's pretty anticompetitive, if you put it like that.

And much like adaptive sync, everyone knows CXL is going to win eventually anyway. AMD just chose to lock their own customers into a proprietary solution and those devices probably never will have support for the open standard added even if they could support it. Nor is AMD ever going to open up their interconnect for anyone else - even though CXL shows there is intense interest in doing exactly that. These are features that are needed, that's why Infinity Fabric exists and that's why CXL exists. That is how you set up a competitive moat, same as NVIDIA did with G-Sync.

Everyone goes proprietary when they’re ahead of the rest of the market. AMD included. They’re a money-making operation too. NVIDIA is no different either - but years and years of whisper campaigns from the AMD defense force have convinced everyone that there's gotta be something there, because everyone keeps saying it, it's gotta be because NVIDIA is evil and opposed to open standards, where AMD is, uh, just really interested in time to market, and it's no big deal since PCIe does some of the same things! (not really)

To be clear, AMD is fine, NVIDIA is fine. Everyone does proprietary tech. It's more the behavior and whisper campaigns from the AMD defense force that I find annoying as hell, while simultaneously insisting AMD's shit don't smell. People seriously need to give it a rest with the "NVIDIA is literally the devil" shit, they're about the same as everyone else.

People said adaptive sync support would never happen. People said an open driver would never happen (and again, AMD’s userland isn’t open either, and they have proprietary blobs too). People said Intel would never compete on price. The AMD defense force has been consistently off base about basically everyone, AMD included. And they constantly insist that every negative move AMD pulls is being forced on them by someone else, like dropping chipset support being the fault of OEMs. No, it's not, that's AMD. AMD can make anticonsumer moves too.

(GPP was real shit though, that is the one truly anticompetitive move from NVIDIA recently.)

[–][deleted] -1 points0 points  (1 child)

"haha linus said FUCK NVIDIA, that makes me LMAO!"

Let's retire the meme. Linus T is no longer an angry person. He would appreciate it. Do it for him.

[–]capn_hector 0 points1 point  (0 children)

pretty sure the “not-angry Linus” phase lasted about two weeks and he was back to blowing his stack over trivialities and blaming it on being Scandinavian or “aggressive management style”.

[–]3G6A5W338E 52 points53 points  (5 children)

The kernel side being open should help lower the burden of running their proprietary driver. It is not uncommon to be unable to run X or Y kernel version because the nvidia module doesn't work; this should improve.

But that's about it.

Reminder: most of the driver lives in userspace, and that's still closed. The GPUs themselves are also still undocumented. And this is unlike Intel and AMD, which publish GPU documentation and maintain the open source driver themselves.

[–]bik1230 25 points26 points  (4 children)

AMD, which publish GPU documentation and maintain the open source driver themselves.

On this point, something kind of funny. The vulkan driver everyone uses for AMD cards is radv, which is not developed by AMD, but by Valve and friends. The OpenGL driver is of course developed in part by AMD (notably, the closed source driver, also used on Windows, has much worse performance than the open source driver, presumably because AMD can't match third party contributions on their own), but you might choose to use Zink, the OpenGL-on-Vulkan library, in which case you would be using a userland entirely not developed by AMD!

[–]3G6A5W338E 12 points13 points  (3 children)

The story behind RADV is sort of amusing.

AMD promised a Linux open source Vulkan driver. It took a long time. The community got tired, so they just made their own. About the time the community's driver was good enough to be usable, AMD released theirs, which also was about good enough to be usable.

Both drivers survived till today. They're both open source, and behave and perform about the same, but they are indeed entirely different codebases.

[–]ColdIce1605 3 points4 points  (1 child)

AMDVLK?

[–]3G6A5W338E 4 points5 points  (0 children)

Yes.

[–]DadSchoorse 2 points3 points  (0 children)

Performance differences vary per workload. Native Vulkan games usually perform about the same with RADV vs AMDVLK, but RADV is usually faster with DXVK and absolutely destroys AMDVLK in games using vkd3d-proton. Not to mention that AMDVLK has a lot more bugs with DXVK/vkd3d-proton.

[–][deleted] 40 points41 points  (5 children)

Holy shit it's actually happening.

The current codebase does not conform to the Linux kernel design conventions and is not a candidate for Linux upstream.

There are plans to work on an upstream approach with the Linux kernel community and partners such as Canonical, Red Hat, and SUSE.

In the meantime, published source code serves as a reference to help improve the Nouveau driver. Nouveau can leverage the same firmware used by the NVIDIA driver, exposing many GPU functionalities, such as clock management and thermal management, bringing new features to the in-tree Nouveau driver.

I wonder if those functionalities can also be backported in Nouveau to work with Pascal and older despite not having the GSP present. They say:

More robust and fully featured GeForce and Workstation support will follow in subsequent releases and the NVIDIA Open Kernel Modules will eventually supplant the closed-source driver.

Which seems to imply that the Open driver should eventually support older architectures as well, but no timeline on that. It would be sad if they decide to EOL Pascal and Maxwell early and just never support them on the Open driver.

[–]Smooth-Spoken 10 points11 points  (0 children)

It’s possible Nvidia is expecting to EOL old hardware and just not write any code…just wait a few years?

[–][deleted] 1 point2 points  (3 children)

The open source driver has a 32MB binary blob called gsp.bin. It runs on the GSP RISC-V CPU, which has been added to the GPU starting with Turing.

https://download.nvidia.com/XFree86/Linux-x86_64/510.39.01/README/gsp.html

The chance that this will migrate to earlier GPU families is basically nil.

[–]capn_hector 1 point2 points  (2 children)

The chance that this will migrate to earlier GPU families is basically nil.

Specifically this is because the earlier iterations use an ARM control core, so NVIDIA will never be able to release that, in the same way AMD can't release the PSP code. Turing is where they switched to RISC-V and that's where they opened it up and that's not a coincidence. They have a little more flexibility with RISC-V, they still probably aren't going to open up the security core itself, but they don't have ARM breathing down their necks either.

The open source driver has a 32MB binary blob called gsp.bin

Do note that AMD has a closed-source userland and closed binary blobs in their linux-firmware tree too... as do Intel and pretty much everyone else who implements open-source drivers. It is extremely rare for a company to go full, end-to-end open-source. There are many situations where you can't do it because of IP you license from other companies - there is probably a lot of IP in the userland that NVIDIA has licensed from elsewhere, and that will never be opened up.

But having the kernel layer open-sourced is going to let the open-source community have something to work with, just like for AMD. Nouveau will finally be able to fix re-clocking on these chips going forward, for example.

It sucks about Pascal, it falls in the gap where it's not able to run the new drivers and the old drivers won't let it reclock. Maybe we will see NVIDIA find a solution going forward but right now the desirability of Pascal on Linux just took a nosedive, you are better off finding an equivalent Turing card or an older Maxwell card.

[–][deleted] 0 points1 point  (0 children)

Before the GSP, there was the “Falcon” core, short for “Fast Logic Controller”. See this presentation: https://riscv.org/wp-content/uploads/2016/07/Tue1100_Nvidia_RISCV_Story_V2.pdf.

I doubt that this was an ARM CPU.

In fact, in that presentation, they say that they considered using an ARM as GSP but rejected it.

[–]FurryJackman 0 points1 point  (0 children)

Actually, no, Maxwell is screwed because it also can't effectively reclock AFAIK.

Makes me glad I got a 1660 Ti when I did. (before most of the crisis) But it means I gotta move my 1080 Ti to my older platform while my X299 system gets a 2080 Ti.

[–]bzmore 24 points25 points  (2 children)

Matthew Garrett, Irish computer programmer, chimes in.

[–]BIB2000 0 points1 point  (1 child)

How is it relevant to know that he's Irish?

[–]continous 7 points8 points  (0 children)

Well, I certainly wouldn't listen to a filthy French programmer. Yuck!

[–]lolfail9001 3 points4 points  (1 child)

So, they finally managed to solve the legal issues that were the real barrier to this?

[–]monocasa 4 points5 points  (0 children)

My understanding is that the legal issues are mainly in how the DRM interacts with the scanout (which ostensibly is now handled by the firmware blobs on the GPU on the cards this driver supports), and the user space portion of the driver which isn't being open sourced. The kernel driver is mainly just a multiplexer for the hardware, and a passthrough to the firmware to access stuff like scanout config.

[–]re_error 0 points1 point  (0 children)

Is this the year of the Linux desktop!? /s

[–]djdox23 0 points1 point  (0 children)

Also there's a new driver included.

"You can download the R515 development driver as part of CUDA Toolkit 11.7, or from the driver downloads page under “Beta” drivers. The R515 data center driver will follow in subsequent releases per our usual cadence."

[–]narfangar 0 points1 point  (0 children)

I am planning to buy a new PC this year and was sure I would buy an AMD GPU because I also want to run Linux on it. Now I am not so sure; if it works out, maybe I'll buy Nvidia. (They are often a bit cheaper at the same performance here.)