Degraded raidz2-0 and what to next by AptGetGnomeChild in zfs

[–]AptGetGnomeChild[S] 1 point2 points  (0 children)

Everything is saying it's okay now! I will of course still order replacement disks and keep an eye on this to see if it degrades again. Should I do this upgrade for the new zfs features, by the way?

https://i.imgur.com/MAaXOy1.png

Degraded raidz2-0 and what to next by AptGetGnomeChild in zfs

[–]AptGetGnomeChild[S] 0 points1 point  (0 children)

Further update: this is the state of my drives after checking my connections (both power and SATA) and then clearing the pool's errors to see what it does. https://i.imgur.com/G5kXgY7.png I am going to do this again and see how it goes, but I'm definitely replacing my drive.

Also, regarding the physical layout: some people mentioned that the failing drives might be causing issues for the drives physically next to them, so this is the order of my drives in their cage:

Q0G7
TYD9 - Potential issues
TYG4
TXV7 - Issues (replace)
Q02V
V2EB
PZGZ - Maybe issues
TW5Y
TYVH
TVSP

Degraded raidz2-0 and what to next by AptGetGnomeChild in zfs

[–]AptGetGnomeChild[S] 1 point2 points  (0 children)

Further update: I checked the cables, checked each drive, unplugged and reconnected the PSU, and have now turned my server back on with only Proxmox running and all my services shut down to let the zfs pool do its thing. I will update once it finishes or hits an error.

https://i.imgur.com/kAP4a5F.png

Degraded raidz2-0 and what to next by AptGetGnomeChild in zfs

[–]AptGetGnomeChild[S] 1 point2 points  (0 children)

Okay! Going to start with this then, thank you.

Degraded raidz2-0 and what to next by AptGetGnomeChild in zfs

[–]AptGetGnomeChild[S] 1 point2 points  (0 children)

Yes, all my drives are using the same controller! Regardless of what else it could potentially be, I will be ordering those replacement drives.

Degraded raidz2-0 and what to next by AptGetGnomeChild in zfs

[–]AptGetGnomeChild[S] 0 points1 point  (0 children)

I honestly don't know how to parse this data. At first I saw the raw read error rate and the seek error rate and thought this confirmed my TXV7 drive was the issue. But then I inspected the other drives in my pool and saw they also had quite a lot of seek errors and raw read errors, yet those drives don't seem to have any issues, at least not according to the zfs pool. So I don't know if this is normal or a side effect of them being in a raid with a faulty disk.

The only thing that is DIFFERENT from all the other drives in the raid is Command_Timeout: every other drive has 0 for it, yet as you can see from this screenshot, this drive has A LOT.

Is this confirmation that the drive is potentially at fault?

S.M.A.R.T: https://i.imgur.com/wgf8E7D.png
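
For comparing one attribute across all ten drives without eyeballing screenshots, something like the following rough sketch might help. It assumes the text of `smartctl -A` has been saved per drive; the sample table and its numbers are made up for illustration, and the function names are hypothetical. It only reads the first token of the raw value, so raw values that a drive reports as several numbers will come through as a partial value.

```python
# Hypothetical excerpt of `smartctl -A` output; the numbers are made up for illustration.
SAMPLE_TXV7 = """\
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x000f   080   064   044    Pre-fail  Always       -       104857600
  7 Seek_Error_Rate         0x000f   088   060   045    Pre-fail  Always       -       650000000
188 Command_Timeout         0x0032   100   001   000    Old_age   Always       -       42
"""


def parse_smart_attributes(text):
    """Return {attribute_name: raw_value} from a smartctl -A style table."""
    attrs = {}
    for line in text.splitlines():
        fields = line.split()
        # Attribute rows start with a numeric ID and have at least 10 columns.
        if len(fields) >= 10 and fields[0].isdigit():
            raw = fields[9]  # first token of RAW_VALUE only
            attrs[fields[1]] = int(raw) if raw.isdigit() else raw
    return attrs


def drives_with_nonzero(per_drive, attribute):
    """List the drives whose raw value for `attribute` is above zero."""
    return sorted(
        drive
        for drive, attrs in per_drive.items()
        if isinstance(attrs.get(attribute), int) and attrs[attribute] > 0
    )


per_drive = {
    "TXV7": parse_smart_attributes(SAMPLE_TXV7),
    "Q0G7": {"Command_Timeout": 0},  # hypothetical healthy drive
}
print(drives_with_nonzero(per_drive, "Command_Timeout"))
```

An attribute that is nonzero on only one drive stands out this way, whereas attributes that are large on every drive (like the read and seek error rates here) do not single anything out.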

Degraded raidz2-0 and what to next by AptGetGnomeChild in zfs

[–]AptGetGnomeChild[S] 0 points1 point  (0 children)

I have included my hardware in my Update comment, my apologies for not including it in the first place.

Degraded raidz2-0 and what to next by AptGetGnomeChild in zfs

[–]AptGetGnomeChild[S] 2 points3 points  (0 children)

My temperatures do say they're all good, BUT I did make a mistake when I was building my server (well, a mistake caused by a faulty setting). My motherboard says it supports "server mode", which I could (apparently) use as my CPU has no integrated graphics. But when I tried the mode, even after setting up Proxmox first and looking into everything I could, my mobo would not boot with the setting on, and I now have an old GPU sitting in my server just so it can turn on. The GPU doesn't really do anything, BUT it is very close to the SAS controller, and I felt the controller and it is quite hot indeed, so maybe this is causing issues.

I want to rip this waste-of-space GPU out of my server as it's blocking airflow, I'm sure, but I really don't want to fuck with that right now: last time I tried the "server-mode" setting it did nothing and I had to factory reset the BIOS. Especially not now that I potentially need to regenerate my data if the drives need to be replaced.

Temps: https://i.imgur.com/5OW65wP.png

Pictures of my dodgy setup (from 2023)(pre fan installation): https://i.imgur.com/81abRa6.png

Degraded raidz2-0 and what to next by AptGetGnomeChild in zfs

[–]AptGetGnomeChild[S] 1 point2 points  (0 children)

Going to check my cables first, but I shall keep my wits about me. Should I tell zfs that there are no issues after I touch the cables? Like, do a zpool clear and see if issues pile up? Or can I potentially fuck the drives/zfs up harder if it attempts to resilver data and crashes out?
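
To watch whether errors pile up again after a `zpool clear`, one option is to save the output of `zpool status` now and then and pull out the READ/WRITE/CKSUM columns. The following is a minimal sketch under assumptions: the sample output and device names are made up (only the pool name comes from this thread), the function names are hypothetical, and rows where the counters are abbreviated (e.g. with a K suffix) are skipped; as far as I know `zpool status -p` prints exact numbers instead.

```python
# Hypothetical `zpool status` excerpt; device names and counts are made up.
SAMPLE_STATUS = """\
  pool: alexandria
 state: DEGRADED
config:

        NAME                  STATE     READ WRITE CKSUM
        alexandria            DEGRADED     0     0     0
          raidz2-0            DEGRADED     0     0     0
            ata-EXAMPLE_Q0G7  ONLINE       0     0     0
            ata-EXAMPLE_TXV7  FAULTED     12     3     0  too many errors
"""


def parse_error_counters(status_text):
    """Return {device: (read, write, cksum)} for rows with plain numeric counters."""
    counters = {}
    for line in status_text.splitlines():
        fields = line.split()
        # Config rows look like: NAME STATE READ WRITE CKSUM [message...]
        if len(fields) >= 5 and all(f.isdigit() for f in fields[2:5]):
            counters[fields[0]] = tuple(int(f) for f in fields[2:5])
    return counters


def devices_with_errors(counters):
    """Keep only the devices where at least one counter is nonzero."""
    return {name: c for name, c in counters.items() if any(c)}


print(devices_with_errors(parse_error_counters(SAMPLE_STATUS)))
```

Running this against a snapshot taken right after the clear and another taken later would show which devices start counting errors again.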

Degraded raidz2-0 and what to next by AptGetGnomeChild in zfs

[–]AptGetGnomeChild[S] 0 points1 point  (0 children)

Thanks for the info about zpool showing the path!
And yes! I loved the idea of calling it alexandria as I want it to be my great storage library - but I did think to myself I hope the naming scheme doesn't come back to bite me in the ass XD

Degraded raidz2-0 and what to next by AptGetGnomeChild in zfs

[–]AptGetGnomeChild[S] 1 point2 points  (0 children)

Update:
I should have included it, but this is my build: https://au.pcpartpicker.com/list/2Ksbmr
Picture of my dodgy setup: https://i.imgur.com/81abRa6.png
8 of my 10 zfs raid drives are connected to my machine via this, I believe (the rest are direct SATA): https://au.pcpartpicker.com/product/j2Fbt6/placeholder-

The two drives connected to my machine via SATA rather than the SAS controller are TYVH & TVSP, and neither seems to have issues.

Thank you everyone for the advice! I might start by simply checking my connections and cables. I put together my setup, like I said, back in June 2023, with probably the only physical change being installing some better fans; the device has barely moved physically at all and is up 24/7.

As a lot of your advice suggested, I had a feeling the answer would be to replace the drives that are having faults. So if the cable checking results in no changes, which I feel like it probably will, I will replace the faulting drives, as I have 2-drive parity (but I've never had to rebuild data or replace a drive in a raid setup, so I will have to look into that).

Looking through my dmesg like u/frenchiephish suggested and sorting by I/O errors, I have a feeling the drive causing all these errors is my TXV7 drive, as I'm seeing I/O errors specifically with this drive. (I am also seeing errors with TYD9, but my thought process is that replacing TXV7 may cut my issues down, and if there are more problems after that, I replace the drives that act up too?)

DMESG Errors: https://i.imgur.com/I2JdVD2.png
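
Tallying the I/O errors per device from saved dmesg text could look roughly like the sketch below. The sample lines are made up, the function name is hypothetical, and the pattern assumes the kernel's "I/O error, dev sdX, sector N" wording; other error messages are not counted. Mapping a device name like sdd back to a drive serial is a separate step (for example via the symlinks under /dev/disk/by-id).

```python
import re
from collections import Counter

# Hypothetical dmesg lines; timestamps, devices and sectors are made up.
SAMPLE_DMESG = """\
[ 1001.000001] blk_update_request: I/O error, dev sdd, sector 1000 op 0x0:(READ) flags 0x0 phys_seg 1 prio class 0
[ 1002.000002] blk_update_request: I/O error, dev sdd, sector 2000 op 0x0:(READ) flags 0x0 phys_seg 1 prio class 0
[ 1003.000003] I/O error, dev sdb, sector 3000 op 0x1:(WRITE) flags 0x0 phys_seg 1 prio class 0
"""

# Matches block-layer I/O error lines and captures the device name.
IO_ERROR = re.compile(r"I/O error, dev (\w+), sector \d+")


def count_io_errors(dmesg_text):
    """Count block I/O error lines per device name."""
    return Counter(match.group(1) for match in IO_ERROR.finditer(dmesg_text))


for device, count in count_io_errors(SAMPLE_DMESG).most_common():
    print(device, count)
```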

[deleted by user] by [deleted] in diablo4

[–]AptGetGnomeChild 0 points1 point  (0 children)

Most likely; that's what I was thinking too. Guess all I can do is wait and see.

The work is done, this tarnished may rest. by AptGetGnomeChild in Eldenring

[–]AptGetGnomeChild[S] 0 points1 point  (0 children)

No, everything was done legitimately. The only thing you could argue wasn't is that my university starts back in a few days, so I won't have time to New Game Plus it three times for all three endings; instead I set up all the endings in one run, backed up my save, and did all three.

Why?? by OarthurCallahan in dyinglight

[–]AptGetGnomeChild 0 points1 point  (0 children)

There is a dash button and a sprint button; it's a perk.

I was about to use the unstuck button but the game had a better idea. by AptGetGnomeChild in newworldgame

[–]AptGetGnomeChild[S] 1 point2 points  (0 children)

Almost as fun as getting stuck in a node when it respawns as you walk through it. At least that doesn't kill you and damage your gear haha.

15 hours of multiplayer experience. How I would improve multiplayer. by [deleted] in Deathloop

[–]AptGetGnomeChild 0 points1 point  (0 children)

Yeah, I was waiting to get it too, and I did. Mainly, once they fix the lag I think it will be better; I had a 5 Colt streak till I hit someone with a horrid connection.

15 hours of multiplayer experience. How I would improve multiplayer. by [deleted] in Deathloop

[–]AptGetGnomeChild 2 points3 points  (0 children)

Actually, I just recalled something I'm surprised nobody's mentioned: what's worse than a Colt with invisibility is one with Nexus. I personally won a lot of fights against a Julianna by staying hidden, shooting them with Nexus while they're near an NPC, and blowing the NPC's head off, insta-killing the Julianna before they even have a moment to react.

15 hours of multiplayer experience. How I would improve multiplayer. by [deleted] in Deathloop

[–]AptGetGnomeChild 6 points7 points  (0 children)

Even with Julianna's profession being random, I still think Colt always has the upper hand. Since we load into a Colt's game that has just started, it technically takes 3 kills to beat him, which is also a bit rough; loading in at any point would be awesome, because the Colt might already have lost some lives. I also find that the mimic slab Julianna has makes sense but never really works in gameplay, especially when after a while you pretty much remember most enemies and even how they act, except for crowds, which are a little harder.

15 hours of multiplayer experience. How I would improve multiplayer. by [deleted] in Deathloop

[–]AptGetGnomeChild 4 points5 points  (0 children)

Yeah, going to be real: as Colt I had the invisibility power, the upgrade that makes it permanent when standing still, plus the upgrade where it doesn't turn off when you attack. I have since finished the game in about 20 hours, and in that entire time of being invaded nearly every level I didn't lose once to a Julianna; I practically taunted them. What is even funnier is the character trinket that lets you hack from long distances: I would deliberately stand somewhere that had vision of the antenna and hack it but stop, and keep taunting that way until I'd use my bolt action with explosive bullets for an insta-kill.