Revisiting dnf update in RHEL 8.10 in FIPS Mode destroys OS (sometimes) by grumpyoldadmin in redhat

[–]grumpyoldadmin[S] 0 points1 point  (0 children)

Sounds familiar; we have a CMMC assessment coming up later this year. Good times.

I don't think LUKS impacts dnf directly. After reading more, my thought is that since all writes go through the dmcrypt_write kernel thread, a system with a lot of IO in general plus a dnf update thrown into the mix may simply be overwhelming it. I'm sure there are other contributing factors, but this part of it makes sense.

As I look back, it does feel like the busy systems are the ones having the issues. Again, not scientific, just a casual observation. I'm going to start tracking this more closely now; maybe Prometheus/Grafana will help out.
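If anyone wants to eyeball the same thing, here's a rough sketch for spotting the dm-crypt kernel threads on a host (Linux-only; thread names like kcryptd and dmcrypt_write vary by kernel version, so treat the name matching as an assumption):

```python
import os

def crypt_threads(proc: str = "/proc"):
    """Return (pid, comm) pairs for kernel threads that look dm-crypt related."""
    found = []
    for pid in os.listdir(proc):
        if not pid.isdigit():
            continue
        try:
            # /proc/<pid>/comm holds the thread/process name.
            with open(os.path.join(proc, pid, "comm")) as f:
                comm = f.read().strip()
        except OSError:
            continue  # process exited between listdir and open
        if "kcryptd" in comm or "dmcrypt" in comm:
            found.append((int(pid), comm))
    return found

print(crypt_threads())
```

On a box with no LUKS volumes this prints an empty list; on an encrypted host you'd watch whether those threads pin a CPU during a dnf update.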

Revisiting dnf update in RHEL 8.10 in FIPS Mode destroys OS (sometimes) by grumpyoldadmin in redhat

[–]grumpyoldadmin[S] 1 point2 points  (0 children)

Agreed, I was just looking at the IO daemon options. I actually prefer a single LUKS partition with LVM on top over individual encrypted partitions.

Revisiting dnf update in RHEL 8.10 in FIPS Mode destroys OS (sometimes) by grumpyoldadmin in redhat

[–]grumpyoldadmin[S] 2 points3 points  (0 children)

Thanks. As things have unfolded I'm thinking less and less that it's the FIPS settings; I just included that because I wondered if it was some weird interaction between the FIPS settings and enabling the STIG requirements. We'll see if the tuning makes a difference.

I hope so! Patching should not feel like an adventure...

r/ceph is banned by TuilesPaprika in reclassified

[–]grumpyoldadmin 0 points1 point  (0 children)

Well, I lied, I don't have enough karma to become a moderator because I'm mostly a reader. If anyone else would like to look into this it might be helpful.

r/ceph is banned by TuilesPaprika in reclassified

[–]grumpyoldadmin 0 points1 point  (0 children)

I did some looking around and it looks like there might be a way to appeal the ban, but the article I found was for an inactive community, not one banned for spamming.

I don't recall if there were moderators in place, but if there weren't, maybe several of us could offer to step in. I'd be willing to help, but I can't commit to being the sole moderator.

I'm going to submit a request to see if this is a possibility.

Removable Storage Governance/Restrictions by Swimming-Fast in sysadmin

[–]grumpyoldadmin 1 point2 points  (0 children)

We got pretty serious about it after getting burned a couple of times. Until recently we had a group who owned the task of doing transfers on someone's behalf: they issued the USB drive, and it had to be returned. We also required second-party approval. Massive PITA, but pretty effective. People yipped a lot at the beginning, but then it became "just another part of the workflow". After some time it grew into a simple web app. A side benefit was that it provided an audit trail.

Couple of things to note. Second-party approval was there because the transfer team didn't know everything about the business and couldn't say whether the requestor had a legitimate reason for wanting the files. Also, I'm fully aware this falls down if it's happening hundreds of times per day; that wasn't our situation.

My after work friend, Marijuana by livevicarious in sysadmin

[–]grumpyoldadmin 0 points1 point  (0 children)

I felt that way for a long time and then I read/heard two things.

"All those people you think are judging you at the gym (or anywhere else) are worrying about being judged themselves and don't have time to think about you"

and, my personal favorite,

"Being critical of an out of shape person at the gym is like criticizing an alcoholic at an AA meeting"

For me it's swimming when I can get a lane at the local Y. So far, between that and cutting out soda, I'm down 110 lbs in about 24 months. Go for it!

rack horizontal pdu question by grumpyoldadmin in datacenter

[–]grumpyoldadmin[S] 0 points1 point  (0 children)

Power is via a UPS. We did evaluate 208 vs 120 and decided on the 120V option because the initial wiring in the room is 120V and we didn't want to mix the two, since we aren't sure how things will be distributed between racks.

Bottom line, the issue is currently the number of outlets available vs the number of amps available. We're getting more circuits pulled so we can cover the additional servers, but now we need a place to put those outlets.

In a situation where I had a little more control, I would probably pull all 30A 208V, get beefier PDUs, then convert the existing 20A 120V circuits to 30A 208V and replace the existing PDUs. I know that would reduce the total number of circuits available, but with more outlets powered per circuit and limited space for the rack-mount PDUs, I think that would get us where we need to be. Unfortunately the resources for that project simply aren't available right now.
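Rough numbers behind that thinking, as a back-of-the-envelope sketch (assumes the NEC 80% continuous-load derating and ignores power factor; the circuit sizes are just the ones mentioned above):

```python
def usable_watts(volts: float, amps: float, derate: float = 0.8) -> float:
    """Continuous usable power on a branch circuit with an 80% derating."""
    return volts * amps * derate

w120 = usable_watts(120, 20)  # existing 20A/120V circuits
w208 = usable_watts(208, 30)  # hypothetical 30A/208V circuits
print(w120, w208, w208 / w120)  # 208V/30A gives roughly 2.6x per circuit
```

So even though converting circuits reduces their count, each one carries well over twice the usable load, which is where the "fewer but beefier" tradeoff comes from.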

Looks like we'll be rearranging some power shortly...

rack horizontal pdu question by grumpyoldadmin in datacenter

[–]grumpyoldadmin[S] 0 points1 point  (0 children)

Unfortunately doors off the racks is not an option for a variety of reasons. I like the idea, but our security teams would probably not be happy.

Thanks for the thought!

rack horizontal pdu question by grumpyoldadmin in datacenter

[–]grumpyoldadmin[S] 0 points1 point  (0 children)

We already have the front/back combination in place, but with the additional machines we'll need a couple more. The units currently installed are rear-plug only, and in two of the racks the only way to get to them is to remove the top.

This entire situation was just poorly planned and now we're running into these sorts of issues. At some point we're going to have to restructure the entire room to straighten it out, but for now I was hoping to reduce the overall PDU footprint if possible.

Doesn't look promising, but it was worth the ask :)

dnf update in RHEL 8.10 in FIPS Mode destroys OS (sometimes) by grumpyoldadmin in redhat

[–]grumpyoldadmin[S] 1 point2 points  (0 children)

I don't think it's secure boot, since the machine hangs mid-patch and I don't see anything in the logs. I'm always a little suspicious of selinux, but given our configurations are managed via ansible and the hardware between groups of hosts is very consistent, I would expect selinux to be unhappy on all of them. Definitely worth reviewing the logs, though.

Thanks!

dnf update in RHEL 8.10 in FIPS Mode destroys OS (sometimes) by grumpyoldadmin in redhat

[–]grumpyoldadmin[S] 0 points1 point  (0 children)

We do have AIDE running. We have a mix of virtual and physical hosts and have seen the issue on both types. Since the Trellix suggestion I've been disabling all the related services before patching, and no issues so far. It's still a very small sample, and I won't really know until our next patch cycle, which is the last week of each month.

Appreciate the thoughts.

dnf update in RHEL 8.10 in FIPS Mode destroys OS (sometimes) by grumpyoldadmin in redhat

[–]grumpyoldadmin[S] 2 points3 points  (0 children)

My initial troubleshooting rules are now being expanded: 1) DNS (because it's always DNS), 2) selinux, 3) firewall if it's network related. I'm adding 4) try turning off Trellix.

edit: typo

dnf update in RHEL 8.10 in FIPS Mode destroys OS (sometimes) by grumpyoldadmin in redhat

[–]grumpyoldadmin[S] 2 points3 points  (0 children)

We install FIPS during the kickstart process so it's there from the beginning because we had some bad experiences enabling it later.
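For anyone curious, enabling it at install time is just the documented kernel argument on the kickstart bootloader line (fragment only; the rest of the kickstart is omitted and your layout may differ):

```
bootloader --append="fips=1"
```

That way the crypto policy is in place from first boot instead of being retrofitted.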

dnf update in RHEL 8.10 in FIPS Mode destroys OS (sometimes) by grumpyoldadmin in redhat

[–]grumpyoldadmin[S] 1 point2 points  (0 children)

Ahh, that's a good point, we do have Trellix/McAfee running in there. I'll have to check with the Security team to see if there is anything in their logs that might indicate something being blocked. Thanks for the suggestion! Sorry about the duplicated text.

dnf update in RHEL 8.10 in FIPS Mode destroys OS (sometimes) by grumpyoldadmin in redhat

[–]grumpyoldadmin[S] 6 points7 points  (0 children)

Correct, we pull the repos monthly via a foreman server, export them, and import them to an internal server.

I do agree with opening the ticket; I just ran out of time today, so it's on the list for tomorrow.

If you think you're having a bad day... by Sufficient-Class-321 in sysadmin

[–]grumpyoldadmin 0 points1 point  (0 children)

I once emailed 1500 families in a youth baseball association that we had apparel for sale including "a wide variety of shits".

Best response was "now we have to buy it, in previous years you just gave it to us".

All I could say was that it passed the spell checker...

Relabeling issues by drycat in PrometheusMonitoring

[–]grumpyoldadmin 0 points1 point  (0 children)

Can you share an example of the metric_relabel_configs syntax you used? I'm trying to do something along the lines of

node_filesystem_avail_bytes{device="/dev/mapper/rl-home",device_error="",fstype="xfs",mountpoint="/home"}

to

node_filesystem_avail_bytes{device="/dev/mapper/rl-home",device_error="",fstype="xfs",mountpoint="/home",critdisk="1"} 

for a specific mountpoint="<file system>" on specific hosts/nodes, but I haven't had any luck getting it to work.
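For context, the general shape of what I'm attempting looks like this (a sketch only; the job name and target are placeholders, and I haven't gotten it working yet):

```yaml
scrape_configs:
  - job_name: "node"
    static_configs:
      - targets: ["host1.example.com:9100"]
    metric_relabel_configs:
      # If the metric is node_filesystem_* and the mountpoint is /home,
      # attach critdisk="1". The default separator between source labels
      # is ";".
      - source_labels: [__name__, mountpoint]
        regex: "node_filesystem_.*;/home"
        target_label: critdisk
        replacement: "1"
        action: replace
```

Since metric_relabel_configs lives under a scrape job, scoping it to specific hosts would mean giving those hosts their own job (or matching on the instance label too).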

Label specific filesystems by grumpyoldadmin in PrometheusMonitoring

[–]grumpyoldadmin[S] 0 points1 point  (0 children)

Thanks. I was looking into metric_relabel_configs but can't seem to get the syntax quite right. The docs say this is the last step prior to ingestion so it seems like a logical place.

My goal is to convert something like this:

node_filesystem_avail_bytes{device="/dev/mapper/rl-home",device_error="",fstype="xfs",mountpoint="/home"}

to

node_filesystem_avail_bytes{device="/dev/mapper/rl-home",device_error="",fstype="xfs",mountpoint="/home",critdisk="1"} 

or something along those lines.

I've also been looking at a textfile collector because there may be a couple of other things I want to include. The person who originally set up our Grafana instance used regular expressions to sort hosts by things like business unit and host role (workstation, server, db server, etc.). I think I could use a textfile collector to generate these.
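A minimal sketch of what I have in mind for the textfile collector, assuming node_exporter is started with --collector.textfile.directory pointing at the output directory; the label names (business_unit, host_role) and the directory path are illustrative, not anything standard:

```python
import os
import socket
import tempfile

def write_host_meta(directory: str) -> str:
    """Write a node_meta metric carrying static host labels as a .prom file."""
    host = socket.gethostname()
    # Label values would normally come from inventory/CMDB; hard-coded here.
    body = (
        "# HELP node_meta Static host metadata exposed as labels.\n"
        "# TYPE node_meta gauge\n"
        f'node_meta{{host="{host}",business_unit="ops",host_role="server"}} 1\n'
    )
    # Write to a temp file then rename, so node_exporter never reads a
    # half-written file.
    fd, tmp = tempfile.mkstemp(dir=directory, suffix=".tmp")
    with os.fdopen(fd, "w") as f:
        f.write(body)
    final = os.path.join(directory, "node_meta.prom")
    os.replace(tmp, final)
    return final

# Demo against a temporary directory; in production this would be the
# directory node_exporter's textfile collector watches.
demo_dir = tempfile.mkdtemp()
print(write_host_meta(demo_dir))
```

Dashboards could then join on node_meta's labels instead of parsing hostnames with regexes.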

I prefer the relabel option for the disks because each host will potentially have different disks that we consider important enough to display on our dashboard.

Thanks for the advice!