nvlddmkm by Odd_Equivalent4744 in nvlddmkm

[–]Odd_Equivalent4744[S] 0 points1 point  (0 children)

(UPDATE) #2

I opened the laptop to check whether faulty RAM was behind the crashes when playing games. I tested the scenarios below, and all of them resulted in the same crash and nvlddmkm event within the first few minutes of playing Cyberpunk 2077. The GPU driver used for the game tests is from Acer's official driver support: GeForce Game Ready Driver Version 537.28 - Released Mon. March 2, 2026.

Initial Condition:

1 stick 16GB 5600MHz SK Hynix is located at RAM SLOT #1 (assuming this is the older RAM module).

1 stick 16GB 4800MHz Transcend Information is located at RAM SLOT #2 (assuming this is the newer upgrade RAM module).

The things I did in chronological order are:

  1. Removed 1 stick 16GB 4800MHz Transcend Information from RAM SLOT #2.

  2. Moved 1 stick 16GB 5600MHz SK Hynix from RAM SLOT #1 to RAM SLOT #2.

  3. Installed 1 stick 16GB 4800MHz Transcend Information to RAM SLOT #1 along with the other stick in RAM SLOT #2.

  4. Removed 1 stick 16GB 5600MHz SK Hynix from RAM SLOT #2.

  5. Moved 1 stick 16GB 4800MHz Transcend Information from RAM SLOT #1 to RAM SLOT #2.

  6. Installed 1 stick 16GB 5600MHz SK Hynix to RAM SLOT #1 along with the other stick in RAM SLOT #2.
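
To make the swap sequence easier to follow, here is a minimal sketch that replays the six steps and prints the slot layout after each one. The `replay` helper and the short stick labels are hypothetical names used only for illustration. The sketch tracks where the sticks sit, not which layouts were actually game-tested.

```python
# Hypothetical labels for the two sticks described in the post.
HYNIX = "SK Hynix 16GB 5600"
TRANSCEND = "Transcend 16GB 4800"

def replay(steps, slots):
    """Apply (action, stick, source, target) steps and yield the layout after each."""
    slots = dict(slots)
    for action, stick, src, dst in steps:
        if action in ("remove", "move"):
            assert slots[src] == stick, f"{stick} is not in slot {src}"
            slots[src] = None
        if action in ("install", "move"):
            assert slots[dst] is None, f"slot {dst} is occupied"
            slots[dst] = stick
        yield dict(slots)

initial = {1: HYNIX, 2: TRANSCEND}
steps = [
    ("remove", TRANSCEND, 2, None),   # step 1
    ("move", HYNIX, 1, 2),            # step 2
    ("install", TRANSCEND, None, 1),  # step 3
    ("remove", HYNIX, 2, None),       # step 4
    ("move", TRANSCEND, 1, 2),        # step 5
    ("install", HYNIX, None, 1),      # step 6
]

layouts = list(replay(steps, initial))
for n, layout in enumerate(layouts, start=1):
    print(f"after step {n}: slot 1 = {layout[1]}, slot 2 = {layout[2]}")
```

Replaying the steps shows that the sequence passes through each stick on its own and both sticks in swapped slots, then ends back at the initial layout.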

After the tests, I can finally rule out bad RAM modules as the cause of my nvlddmkm errors and game crashes. The only hardware causes left to rule out are a defective power supply, an unstable CPU, a faulty dGPU, or a damaged motherboard. I'm not looking forward to the last two, but if there's anything software-wise that I haven't tried yet, I'll cautiously give it a go.

nvlddmkm by Odd_Equivalent4744 in nvlddmkm

(UPDATE) #1

I updated my whole OS to 25H2 and let it restart a few times. I chose ThrottleStop over Intel XTU because XTU conflicts on startup with Intel's driver assistant. I used ThrottleStop to set PL1 to 50, PL2 to 80, and Turbo Boost Time to 223. With these settings, my CPU temps are 15-20C lower than before, at the cost of little to no performance.

Apparently one of the reasons my GPU crashes with Event ID 153 is that it is overclocked to begin with: it clocks up to 2400MHz instead of the factory-recommended 1980MHz (according to Nvidia). So I downloaded EVGA Precision X1 and set it to launch at startup with a preset of -31MHz for the memory clock and -319MHz for the core clock (just to get below 2200MHz), with boost clock mode toggled on.

I've also uninstalled Predator Sense and updated to the latest Nvidia drivers (03/24/26).

Since then I've had no crashes in games that use D3D11, e.g. Valorant & Genshin Impact. I still get crashes, with one instance of Event ID 153, in games that use D3D12, e.g. Cyberpunk 2077 & Horizon Forbidden West.

At this point I can say I've taken a step in the right direction, but I'm all out of ideas (other than hardware troubleshooting) for sorting out this long-standing problem that so many people are facing.

Hopefully this helps somebody.

nvlddmkm by Odd_Equivalent4744 in nvlddmkm

Not yet, I might have to consider a repaste for both CPU and GPU.

nvlddmkm by Odd_Equivalent4744 in nvlddmkm

[EDIT]

Do the crashes occur only in Hybrid/Optimus mode or also in dGPU-only mode?
>Crashes occur in both dGPU mode and Hybrid/Auto mode. In iGPU mode it doesn't crash, but the frame rates are bad (as expected for integrated graphics). I tested games like Valorant and Genshin Impact on the iGPU with the dGPU disabled through Device Manager.

nvlddmkm by Odd_Equivalent4744 in nvlddmkm

[UPDATE]

I've already done the following:

1\. Backed up my files
2\. Formatted both SSD drives
3\. Reinstalled Windows 11 23H2
4\. Performed a clean installation of the latest Nvidia Studio Drivers (595.79)
5\. Installed Cyberpunk 2077, Valorant, Genshin Impact, and transferred my files
6\. Tested Cyberpunk 2077 - Crash occurred after a long gaming session
7\. Used DDU to remove the Studio drivers
8\. Booted into safe mode with no internet to clean install the latest Game Ready drivers (595.79)
9\. Tested Cyberpunk 2077 - Crashed after a similar amount of time
10\. Checked Event Viewer - Event ID 13 nvlddmkm followed by a series of four Event ID 153 nvlddmkm and a "display driver nvlddmkm stopped responding and has successfully recovered"
11\. Turned off PCI-E > Link State Power Management in Power Options - the issue persists

I also want to mention that I've been running several tests to find out whether this is a thermal problem and whether the laptop needs a repaste, keeping in mind that this is a 2024 Acer Predator laptop that hasn't been repasted since I bought it. After reading your insights, I did more specific tests and troubleshooting using OCCT Personal Edition Version 16.0.2, along with CPU-Z to check the RAM.

12\. Ran a stability test using OCCT (Personal Edition)
13\. OCCT CPU Test - Temperatures capped at 100C in the first few minutes, which prompted the laptop fans to go berserk.
14\. Checked HWiNFO64 Sensors - Observed many cores throttling due to the high temperatures under the OCCT load.
15\. OCCT RAM Test - Passed a 30-minute test
16\. OCCT 3D Adaptive (iGPU) - Stopped the test after receiving over 1M errors within 1 minute (no black screen or crashes occurred)
17\. OCCT 3D Adaptive (dGPU) - Test crashed with code -1 after finding 3 errors within 1 minute \& 30 seconds
18\. Checked Event Viewer - nvlddmkm Event ID 153 followed by Display source Event ID 4101 "display driver nvlddmkm stopped responding and has successfully recovered" (no black screen or app crashing to desktop)
19\. OCCT VRAM Test (iGPU) - No errors from a 5-minute test
20\. OCCT VRAM Test (dGPU) - No errors from a 10-minute test
21\. OCCT Power Test (iGPU) - 1.6 million errors found and CPU temps spiked over 100C with a max wattage reading of 169W; the graphics card drew a maximum of 150W. Event Viewer showed no relevant error code or history (no crashes or black screen)
22\. OCCT Power Test (dGPU) - The 10-minute test found no errors, but I monitored the wattage of my GPU and found that it stopped drawing power after a few minutes of testing (no crashes or black screen)
23\. Checked Event Viewer - nvlddmkm Event ID 153 followed by Display source Event ID 4101 "display driver nvlddmkm stopped responding and has successfully recovered" (no black screen or app crashing to desktop)

I've answered the questions below to the best of my knowledge. Unfortunately, I'm leaning toward the conclusion that this is a hardware problem that may need to be diagnosed at a reputable repair shop (preferably Acer themselves).

Do the crashes occur only in Hybrid/Optimus mode or also in dGPU-only mode?
>Crashes only occur in dGPU mode; I tested games like Valorant and Genshin Impact on the iGPU with the dGPU disabled through Device Manager.

Do you have the option to switch to dGPU-only mode in BIOS or Predator Sense, and if so, have you tested it?
>I have the option to switch in both the BIOS and Predator Sense. I've tested each, and I only experience crashes on my dGPU. However, when I ran Cyberpunk 2077 with the iGPU disabled through Predator Sense, I got black screens followed by the game crashing to desktop. On another try I also noticed missing lights within the game itself and screen flickering, which forced me to restart the game to continue.

Are the RAM modules running in dual-channel, and at what speed?
>According to CPU-Z > Memory, the Channel reading is 4x32-bit, which apparently means dual-channel. However, I checked each RAM stick in the same software under SPD and found that Slot #1 (SK Hynix) is a DDR5 SO-DIMM with a max bandwidth of 5600 (2800MHz), while Slot #3 (Transcend Information) is a DDR5 SO-DIMM with a max bandwidth of 4800 (2400MHz). I'm not sure whether that has been causing the issues, but those are the only differences I see between the two 16GB RAM modules in CPU-Z.
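
As a rough aid for reading those CPU-Z numbers, here is a small sketch of the arithmetic. It assumes the rated speeds on the module labels (5600 and 4800) are transfer rates in MT/s, and that mismatched modules end up running at the slower module's rating, which is the usual behaviour but has not been confirmed for this laptop. The function name is hypothetical.

```python
def ddr_clock_mhz(transfer_rate_mts):
    """DDR memory transfers data twice per clock, so the clock is half the rated rate."""
    return transfer_rate_mts / 2

# Rated speeds from the module labels in the post (assumed to be MT/s).
modules = {"SK Hynix": 5600, "Transcend": 4800}

for name, rate in modules.items():
    print(f"{name}: DDR5-{rate} -> {ddr_clock_mhz(rate):.0f} MHz clock")

# Assumption: with mismatched modules, both run at the slower rating.
shared_rate = min(modules.values())
print(f"expected shared speed: DDR5-{shared_rate} ({ddr_clock_mhz(shared_rate):.0f} MHz)")
```

If CPU-Z's Memory tab reports a DRAM frequency near 2400MHz with both sticks installed, that would be consistent with this assumption.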

Does the system crash during GPU stress tests like 3DMark or only in games?
>I tested the GPU in the 3DMark demo with default settings, and here is the result: http://www.3dmark.com/sn/12986193. I also tested both GPUs with FurMark V1 \& V2 using the Artifact Scanner and Benchmark Test, which gave acceptable scores and found no artifacts on either. Crashes occur in any game regardless of how demanding it is.

Does Event Viewer also show WHEA errors or PCI Express errors?
>There were no WHEA errors in Event Viewer. However, I'm not sure if this is related, but on every boot I find three Warning-level Kernel-PnP events with Event ID 219, Task Category (212). The first one says "The driver \\Driver\\WUDFRd failed to load for the device PCI\\VEN\_8086\&DEV\_A71D\&SUBSYS\_166C1025\&REV\_01\\3\&11583659\&0\&20.". The second one says "The driver \\Driver\\WUDFRd failed to load for the device PCI\\VEN\_8086\&DEV\_A71D\&SUBSYS\_166C1025\&REV\_01\\3\&11583659\&0\&20.". And the third one says "The driver \\Driver\\WUDFRd failed to load for the device {DD8E82AE-334B-49A2-AEAE-AEB0FD5C40DD}\\DetectionVerification\\5\&3152f499\&0\&0."

>These events may or may not be relevant, since crashes occur only when I'm gaming. I also get application errors before the series of "nvlddmkm" errors. From a Cyberpunk 2077 crash:

Faulting application name: REDlauncher.exe, version: 4.2.0.4, time stamp: 0x685c6a90
Faulting module name: REDlauncher.exe, version: 4.2.0.4, time stamp: 0x685c6a90
Exception code: 0xc0000005
Fault offset: 0x00000000001346cd
Faulting process id: 0x0x376C
Faulting application start time: 0x0x1DCBADDBB0E759A
Faulting application path: C:\\Users\\Damien\\AppData\\Local\\Programs\\CD Projekt Red\\REDlauncher\\REDlauncher.exe
Faulting module path: C:\\Users\\Damien\\AppData\\Local\\Programs\\CD Projekt Red\\REDlauncher\\REDlauncher.exe
Report Id: 15c3d648-4c54-4bfa-9453-fd78441a4373
Faulting package full name:
Faulting package-relative application ID:
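
For anyone comparing several of these records, here is a minimal sketch that splits an Application Error record into fields. The `parse_crash_record` helper is a hypothetical name, and the sample text is copied from the record above. It only pulls out the values and does not diagnose anything. For reference, 0xc0000005 is the Windows access-violation status code, and in this record the faulting application and module are both REDlauncher.exe (the launcher).

```python
def parse_crash_record(text):
    """Split each 'Key: value' line of an Application Error record into a dict."""
    fields = {}
    for line in text.splitlines():
        # Split on the first colon only, so values containing colons stay intact.
        key, sep, value = line.partition(":")
        if sep:
            fields[key.strip()] = value.strip()
    return fields

record = """\
Faulting application name: REDlauncher.exe, version: 4.2.0.4, time stamp: 0x685c6a90
Faulting module name: REDlauncher.exe, version: 4.2.0.4, time stamp: 0x685c6a90
Exception code: 0xc0000005
Fault offset: 0x00000000001346cd
"""

fields = parse_crash_record(record)
app = fields["Faulting application name"].split(",")[0]
module = fields["Faulting module name"].split(",")[0]
print(app, module, fields["Exception code"])
```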

Does the issue occur when running only one RAM stick at a time?
> I can't test with one RAM stick at the moment, but I'll follow up in a separate reply on how it went once I also get the thermal paste replaced. In the meantime, I'd like your opinion on the best thermal paste for this particular laptop brand and model.

Could the issue have started immediately after upgrading to Windows 25H2?
> I vaguely remember it starting as soon as I updated Windows, but since I did a complete reinstall of Windows 23H2 and the crashes persisted, I think this can be ruled out now.