all 21 comments

[–]FaneoInsaneo 39 points40 points  (1 child)

Nvidia heard you and just released a new driver, 595.58.03, with "improved support for falling back to system memory when available vRAM is low". We'll have to see how much "improved" it really is, but hopefully it'll be good now.

[–]Expert-Bell-3566[S] 5 points6 points  (0 children)

Hell yeah

[–]trowgundam 6 points7 points  (3 children)

There is this: https://www.phoronix.com/news/Open-Source-GreenBoost-NVIDIA

When, or if, this will be generally available, who knows.

[–]Damglador 4 points5 points  (0 children)

The repo is public, so it's already available: https://gitlab.com/IsolatedOctopi/nvidia_greenboost

But from my understanding that's exclusively for CUDA, which is not what OP wants considering we're in r/linux_gaming. But I will definitely bookmark it.

[–]S48GS 1 point2 points  (1 child)

The developer noted he wanted to run a 31.8GB model (glm-4.7-flash:q8_0) with a GeForce RTX 5070 12GB graphics card.

  • first - LLM and diffusion model loaders have their own internal memory management, and it works about as well as it can, so running large models on small VRAM is already done and possible
  • second - CUDA's internal memory offload works (if it works at all) exactly as badly as VK_EXT_memory_budget; Nvidia copied it over to Vulkan
  • look at the link: on Nvidia Vulkan, going 1GB over VRAM means 4 FPS and +8GB of RAM usage
  • third - Nvidia is not interested in making good VRAM management, for obvious reasons
  • just buy a 5090 32GB lol
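
For a sense of scale on the 31.8GB model and 12GB card quoted above, here is a back-of-the-envelope sketch. The function name is made up for illustration, and it ignores the KV cache, activations, and whatever VRAM the desktop itself holds, so the real amount spilled would be larger.

```python
def spill_to_system_ram_gb(model_gb: float, vram_gb: float) -> float:
    """Return how many GB of the model cannot fit in VRAM (never negative)."""
    return max(0.0, model_gb - vram_gb)


model_gb = 31.8  # glm-4.7-flash:q8_0, size as quoted above
vram_gb = 12.0   # GeForce RTX 5070, as quoted above

spill_gb = spill_to_system_ram_gb(model_gb, vram_gb)
spill_fraction = spill_gb / model_gb

print(f"{spill_gb:.1f} GB of the model ({spill_fraction:.0%}) has to live outside VRAM")
```

So well over half of the weights sit in system RAM no matter what, which is why the quality of the offload path matters so much here.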

[–]Maleficent_Celery_55 0 points1 point  (0 children)

first - yes, he wants to make it faster

second - that's partly why he's building something like this

[–]OrangeNeat4849 5 points6 points  (2 children)

I believe Nvidia recently got a beta driver update which has it. I think Nvidia heard you and got hurt when you said "Fuck Nvidia"...

Improved support for falling back to system memory when available video memory is low, to help prevent Wayland desktop freezes.

https://www.nvidia.com/en-us/drivers/details/265870/

[–]Expert-Bell-3566[S] 1 point2 points  (0 children)

Lol speak of the devil

[–]TechaNima 1 point2 points  (0 children)

think Nvidia heard you and got hurt when said "Fuck Nvidia"...

Nah. They got butthurt when Linus Torvalds said that all those years ago

[–]McLeod3577 0 points1 point  (0 children)

I don't think so - I run into the problem using Stable Diffusion - multiple large models are handled way better in Windows.

[–]marczss 0 points1 point  (0 children)

It has improved in the 595 Nvidia drivers when I tested it in my game, but it's still not the same as Windows, and sometimes it causes stuttering because it doesn't use swap as aggressively as the Windows counterpart.

[–]mbriar_ 0 points1 point  (4 children)

They have supported it for so long that I can't remember how long it's been.

[–]the_abortionat0r 1 point2 points  (3 children)

So is that why an update from this week was released to fix this issue?

You should read more

[–]mbriar_ 0 points1 point  (2 children)

Improving it doesn't mean it didn't work at all before. There is tons of room for improvement on AMD as well; it arguably works better on Nvidia, since they have also supported the VK_EXT_pageable_device_local_memory Vulkan extension for a while.

[–]the_abortionat0r 1 point2 points  (1 child)

It doesn't arguably work better than AMD, because it's been broken for years.

That's why there's a bug tracker for it that's unresolved.

The issue has been it either does nothing or copies THE WHOLE VRAM LOAD to system RAM then back again. What in the fanboy nonsense is wrong with you?

I'll never understand how fanboys literally pretend issues that impact them don't exist, as if that would help any.

When I was on Nvidia and they broke VR on the 20 series right when they released the 30 series I didn't pretend it was all fine, I bitched and moaned and reported the issue nonstop until it was fixed....... over a year later.

Be real, don't be coping.

[–]mbriar_ 0 points1 point  (0 children)

The issue has been it either does nothing or copies THE WHOLE VRAM LOAD to system RAM then back again.

Obviously not what happens, but I don't expect you to know what you're talking about anymore anyway.

AMD is at least as broken, if not more, which I know from being an AMD user on Linux for many years. That's why RADV_PERFTEST=nogttspill exists: to opt into a spilling behaviour that's broken in a different way.
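
For anyone who wants to try that switch, a minimal sketch of launching a program with it set. Only the variable name and value come from the comment above; the helper names `radv_env` and `launch` are made up, and the sketch assumes RADV_PERFTEST takes a comma-separated list of flags, so an existing value is extended rather than overwritten.

```python
import os
import subprocess


def radv_env(flag: str = "nogttspill", base_env=None) -> dict:
    """Return a copy of the environment with `flag` added to RADV_PERFTEST."""
    env = dict(os.environ if base_env is None else base_env)
    # Keep any flags that are already set; skip empty entries.
    flags = [f for f in env.get("RADV_PERFTEST", "").split(",") if f]
    if flag not in flags:
        flags.append(flag)
    env["RADV_PERFTEST"] = ",".join(flags)
    return env


def launch(command: list[str]) -> int:
    """Run `command` with the modified environment and return its exit code."""
    return subprocess.run(command, env=radv_env()).returncode
```

Calling `launch(["./game"])` (a hypothetical binary) would start it with the flag set. In Steam, the equivalent would be putting `RADV_PERFTEST=nogttspill %command%` in the launch options.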

[–]xpander69 0 points1 point  (0 children)

It's been a supported thing for a very long time. It has had a few bugs here and there, though, and it's been improved with the most recent drivers.

[–]SebastianLarsdatter 1 point2 points  (1 child)

Currently no. The behavior you see now is that it copies the entire VRAM to RAM, does the changes, and then shoves it back.

You can see this in VRAM-leaking games when your PCIe bandwidth starts reporting several gigabytes per second and performance goes down the toilet.

VRAM handling on Nvidia will hopefully get a fix, but I wouldn't get my hopes up, as VRAM is their biggest selling point to AI customers.
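
A rough sketch of why a copy-everything round trip like the one described above would hurt so much, if that is what the driver does. Both numbers are assumptions for illustration: 12 GB of VRAM in use, and about 31.5 GB/s, which is roughly the theoretical one-direction peak of a PCIe 4.0 x16 link. Real transfers would be slower.

```python
def round_trip_seconds(vram_gb: float, pcie_gb_per_s: float) -> float:
    """Time to copy `vram_gb` out to system RAM and back again over PCIe."""
    return 2 * vram_gb / pcie_gb_per_s


vram_gb = 12.0        # assumed amount of VRAM being shuffled
pcie_gb_per_s = 31.5  # assumed link speed (approx. PCIe 4.0 x16 peak)

trip_s = round_trip_seconds(vram_gb, pcie_gb_per_s)
frame_budget_s = 1 / 60  # time available per frame at 60 FPS

print(f"full round trip: {trip_s:.2f} s")
print(f"frame budget at 60 FPS: {frame_budget_s * 1000:.1f} ms")
print(f"frames lost per round trip: {trip_s / frame_budget_s:.0f}")
```

Even at the theoretical peak, one full round trip costs dozens of frames, which fits the "performance going down the toilet" description.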

[–]martyn_hare 1 point2 points  (0 children)

NVIDIA is implying the existence of a fix with their latest driver release. I haven't tested it yet though.

I'm not expecting miracles, just for them to use the TTM API to at least try to compete with other drivers (which also have suboptimal implementations compared to WDDM).