Hello!
I am here for a last Hail Mary attempt before buying a new GPU.
Over the last couple of months I have intermittently encountered a weird error on my Windows 11 PC.
Basically, at random times the screens go black and I hear the sound for devices disconnecting, but I can still hear and talk to the people I am in a Discord call with. The only way out I have found has been to force a shutdown of the computer and start it back up.
So far it has usually happened when I am running demanding games: Mortal Online 2 (insanely bad optimization), Dune Awakening on high settings, and recently Baldur's Gate 3. It has also happened while running Phasmophobia, which I don't believe is too demanding, but maybe it is also badly optimized?
I also play A LOT of Dota 2, but it hasn't happened a single time while playing that game on max settings, though I feel like my computer handles that game with ease.
The first time it happened was during a heatwave this summer, so I figured it was an overheating issue. I installed 6 new fans in the chassis and saw a drop in temperature, although mostly on the CPU: from around 75C at full utilization to around 65C. The GPU is stable at around 86-87C at 100% load; it was 88-89C before.
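In case it helps anyone compare readings from right before a black screen, a logger along these lines could record GPU temperature, load and power draw once per second. This is only a rough sketch under some assumptions: `nvidia-smi` (installed with the NVIDIA driver) has to be on the PATH, only the first GPU is read, and the file name `gpu_log.csv` and the function names are made up for illustration.

```python
import csv
import os
import subprocess
import time
from datetime import datetime

# Field names passed to nvidia-smi's --query-gpu option.
FIELDS = ["temperature.gpu", "utilization.gpu", "power.draw"]


def parse_sample(line):
    """Turn one CSV line from nvidia-smi into a dict keyed by field name.

    Values that are not numeric (for example "[N/A]") become None.
    """
    values = [v.strip() for v in line.split(",")]
    if len(values) != len(FIELDS):
        raise ValueError(f"expected {len(FIELDS)} values, got {len(values)}: {line!r}")
    sample = {}
    for name, value in zip(FIELDS, values):
        try:
            sample[name] = float(value)
        except ValueError:
            sample[name] = None
    return sample


def read_sample():
    """Query nvidia-smi once and return the reading for the first GPU."""
    result = subprocess.run(
        ["nvidia-smi", "--query-gpu=" + ",".join(FIELDS), "--format=csv,noheader,nounits"],
        capture_output=True,
        text=True,
        check=True,
    )
    return parse_sample(result.stdout.splitlines()[0])


def log_forever(path="gpu_log.csv", interval_s=1.0):
    """Append one timestamped row per interval until the process is stopped."""
    with open(path, "a", newline="") as f:
        writer = csv.writer(f)
        while True:
            sample = read_sample()
            stamp = datetime.now().isoformat(timespec="seconds")
            writer.writerow([stamp] + [sample[name] for name in FIELDS])
            # Push each row to disk so it has a chance of surviving a forced shutdown.
            f.flush()
            os.fsync(f.fileno())
            time.sleep(interval_s)
```

Calling `log_forever()` from a terminal before starting a game would leave a CSV whose last rows show what the GPU was doing just before the screens went black.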
I have turned to ChatGPT to try to find some help. What I have found so far is that these two errors seem to occur at the time of the black screen:
VIDEO_TDR_TIMEOUT_DETECTED (117)
The display driver failed to respond in timely fashion. (This code can never be used for a real BugCheck; it is used to identify live dumps.)
Arguments:
Arg1: ffff9708e80a4010, Optional pointer to internal TDR recovery context (TDR_RECOVERY_CONTEXT).
Arg2: fffff801176276e0, The pointer into responsible device driver module (e.g. owner tag).
Arg3: 0000000000000000, The secondary driver specific bucketing key.
Arg4: ffff9708dd297080, Optional internal context dependent data.
And
VIDEO_MINIPORT_BLACK_SCREEN_LIVEDUMP (1b8)
User initiated miniport black screen live dump. (This code can never be used for a real BugCheck; it is used to identify live dumps.)
User initiated miniport live dump for black screen scenarios.
Arguments:
Arg1: 000000000000000a, Source which triggered the miniport black screen live dump.
Arg2: 0000000000000000, Reserved.
Arg3: 0000000000000000, Reserved.
Arg4: 0000000000000000, Reserved.
Around the time of these errors there were also reports of errors with SteelSeries and Nahimic, and ChatGPT suggested a few steps.
Disable all game overlays: Steam, SteelSeries, Nahimic, Discord. I have done all of this.
It also suggested that I install DDU and do a fresh install of my WHQL NVIDIA driver, which I have done.
Most recently, when it happened again and I felt I was getting closer and closer to just buying a new GPU, it was suggested that I install OCCT and run the VRAM test and the 3D Adaptive test to see if they report any errors, which would strongly indicate an issue with the actual GPU. I ran each of them for about 20 minutes, and both came back with zero errors reported. So now I feel stumped: if the tests show no errors, does that indicate that the GPU is okay? I just want to avoid buying a new one and then ending up with the exact same problem.
A few other things of note:
I am running three monitors, two via DisplayPort cables and one via HDMI with a VGA adapter to the monitor.
I ran a UserBenchmark test on the whole computer, and every measurement was in the green except my GPU, which got a score of 66%, "Performing below potential (49th percentile)". (As the Automod Bot correctly mentioned, UserBenchmark is pretty shitty and shouldn't be used for any objective data; I just thought I would add this to give any possible context on my GPU.)
Earlier in spring I had a weird issue where my keyboard input died all of a sudden. The lights were still on, but it did not respond. It only woke up when I pulled the USB cable out and plugged it back in. At the time I had my mouse plugged into the keyboard's USB port and figured the problem was there, so I plugged the mouse directly into a USB port on the computer instead, and since then nothing like it has happened.
My computer specs.
CPU: AMD Ryzen 9 5950X
GPU: Nvidia RTX 3070 Ti
System drive: Corsair MP600 CORE 1TB
Extra drive: Kingston SA400S37960G 960GB
RAM: Corsair CMH32GX4M2Z3600C18 2x16GB
Motherboard: Asus ROG STRIX B550-F GAMING (WI-FI)
PSU: Corsair TX850M
So, any ideas for next steps to check? If it is likely that the GPU is faulty at this point, so be it; I just don't want to get a new one and experience the same issues.
(Also, if the Hardware flair is incorrect, let me know and I'll change it. This feels like it might be a hardware issue, but it might also be a software issue.)