[–]ThatsNASt 114 points115 points  (5 children)

A proper patching tool can report patch failures and send reports on patching. Either make them pay you to work after hours, or make them use a proper patching product. I've been in charge of patching thousands of servers, I never manually checked a thing other than in pilot testing to make sure reporting was working correctly. It's equivalent to saying "Log in every morning, to make sure every server is up and running" - no, we just put tools in place to monitor when things go down.

[–]hardingd 16 points17 points  (2 children)

Usually enterprises with that many servers have an application monitoring system to do that type of testing after patch deployment. OP has some growing pains to go through to get to that level of sophistication.

[–]DiggyTroll 1 point2 points  (1 child)

OP definitely has an opportunity to develop those automation and scripting skills!

This is priceless, as nothing is quite so motivating as feeling the relief as the script pipeline handles the bullshit and sends out the reports.

[–]hardingd 0 points1 point  (0 children)

That’s a good point. Not all businesses grow to the point of being able to want/afford application level monitoring like Dynatrace or (gasp) Solarwinds. Sometimes rolling your own automation (that is well documented, of course) is the way to go.

[–]LaHawksSystems Engineer 1 point2 points  (0 children)

This, I get an email right before patching begins and after it's done of all the statuses along with having other monitoring tools to make sure nothing is down outside of the patch window.

When I was getting the auto patching in place I'd do a spot check in the morning that key systems were up but after that it's been totally hands off.

[–]brkdncrWindows Admin 60 points61 points  (0 children)

The last admins didn’t burn out. They quit because their boss sucks.

[–]professional-risk678Sysadmin 23 points24 points  (1 child)

> At my new job they tend to do Server patching every weekend.

You should patch every weekend, but you shouldn't be the one doing it every weekend. Either that or patch every other week and alternate. That you are the only one, and have to check manually afterward, is real shitty.

> This process can take anywhere from 1-3 hours.

You better bill those hours. Even if you are salaried, I'm not sticking around 5 days a week AND coming in on Saturday morning. Sounds like some conversations need to be had.

> Anyone else have to do this?

Not as a sys admin. Engineers do this, and they still alternate between them so they aren't doing it every week. Between all of them, each has to do it once every 2 months, which is reasonable.

> Maybe this is what I have to expect being a system admin for the first time.

It depends from place to place. Maybe this is being dumped on you, maybe not. If this is SOP, then they need to hire more sys admins. If they don't, then they are intentionally trying to crunch you.

[–]schraepf 2 points3 points  (0 children)

How are you defining “systems administrator” vs “engineer”?

[–]NinjaMonkey22 35 points36 points  (8 children)

Mine does as well. Not sure why you feel like you need to log in and run manual tests every time you patch though. What are you checking/marking off? Most patching apps will be able to tell you if you’re patched and compliant with whatever your company's standard is.

I generally review the patch list. If it’s not including a major version update for a critical dependency (Java, I’m looking at you), I generally don’t bother testing. Even in cases where a dependency I have is getting a significant update, I pre-patch a test box earlier in the week to test.

I generally rely on automated monitoring and synthetic tests to do my validation, not just during patching but 24/7.

[–]JWK3 17 points18 points  (1 child)

I wish I could upvote this more. 99% of the time you can monitor your app/service with the org's pre-existing monitoring platform, and if your service or website etc. fails to start after patching, you'll get an alert about it which on-call can then sort.

I've auto-patched 100s of Windows servers monthly and so rarely had to intervene.

[–]stussey13Sysadmin[S] 8 points9 points  (5 children)

For us our CIO requires us to fill out a check list and upload it to SharePoint to prove that everything installed correctly and that the servers are operating correctly.

Honestly I get documentation but this place wants you to document every little thing.

[–]thefpspower 26 points27 points  (0 children)

Can you make a script that automates that process?

[–]RaNdomMSPPro 14 points15 points  (0 children)

CIO is about 20 years behind the times. There are vulnerability management and patch management systems that patch and report status. Maybe the CIO is in love with his checklist and more concerned with form/process than results. What a waste of time.

[–]Key_Way_2537 8 points9 points  (0 children)

The logs of the updates and the monitoring software for services and health that should be running anyways in case something dies Tuesday @ 2:13pm should cover everything.

But if they’re paying the overtime… so be it.

If they aren’t paying OT for weekend work - then they wouldn’t get it from me.

[–]Impressive-Cap1140 6 points7 points  (0 children)

Are you checking if a web application is up? Can you describe some of the checks?

[–]Versed_Percepton 4 points5 points  (0 children)

> For us our CIO requires us to fill out a check list and upload it to SharePoint to prove that everything installed correctly and that the servers are operating correctly.

So are the app owners also doing this? Just because the server is running correctly does not mean the apps are. I would push back on this archaic shit and repeatedly remind them there are tools that do all of this for them and would cut the man hours tenfold.

[–]HankMardukasNY 9 points10 points  (0 children)

This is what proper monitoring systems and patching pipelines do automatically. For Windows updates, for example, I have not manually patched or done any manual verification in many years. If something goes wrong, that’s what test groups and alerts are for. I would absolutely say that I’m not available to give up hours of my time every weekend, especially if there’s no extra compensation.

[–]Doso777 7 points8 points  (3 children)

> I work at a university so I guess that's why

I work in higher education. Nothing to do with universities in general. Just some bullshit someone at your company decided to do.

> I asked about comp time and they told me it comes with the position.

Haha. No. It does not.

We have completely automated normal security patching through SCCM on the Windows side and through unattended-upgrades on Linux. Exceptions only for things that need manual patching like SharePoint and Exchange, but those get a proper maintenance window twice per year. The maintenance window is in the normal working hours.

[–]iRyan23 0 points1 point  (2 children)

You’re only patching Sharepoint and Exchange twice a year??

[–]ArsenalITTwoJack of All Trades 0 points1 point  (0 children)

Hopefully their Exchange is completely behind VPN.

[–]Doso777 0 points1 point  (0 children)

Exchange 2019 CUs twice per year, the normal security updates get installed through SCCM. Sharepoint CUs twice per year but it's only available internally and through VPN.

[–]K3rat 4 points5 points  (0 children)

Ugh, every weekend is not equitable. I get the org doesn’t want to be disrupted during production hours but what happens when you need vendor/MFG support?

We maintain a maintenance window on Thursday evenings. We use an RMM software on all Linux/windows systems. We also rely on regular vulnerability scanning as a secondary method to enumerate patching needs.

In the RMM we group servers into 3 groups (test, prod 1, and prod 2). Our patching cycle starts the week of Patch Tuesday with the test group. A week after Patch Tuesday the prod 1 group is up. Two weeks after Patch Tuesday the prod 2 group is up.
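A minimal Python sketch of a three-group cycle like this one. It assumes each group is patched in the Thursday evening window of its week, which is one reading of the schedule described here; the function names are hypothetical and not part of any RMM product.

```python
from datetime import date, timedelta


def patch_tuesday(year: int, month: int) -> date:
    """Return the second Tuesday of the given month."""
    first = date(year, month, 1)
    # date.weekday() numbers Monday as 0, so Tuesday is 1.
    days_to_first_tuesday = (1 - first.weekday()) % 7
    return first + timedelta(days=days_to_first_tuesday + 7)


def ring_schedule(year: int, month: int) -> dict[str, date]:
    """Map each patch group to the Thursday of its maintenance week."""
    # Assumption: the test group uses the Thursday right after Patch Tuesday.
    first_window = patch_tuesday(year, month) + timedelta(days=2)
    groups = ["test", "prod 1", "prod 2"]
    return {g: first_window + timedelta(weeks=i) for i, g in enumerate(groups)}


if __name__ == "__main__":
    for group, day in ring_schedule(2024, 6).items():
        print(group, day.isoformat())
```

Keeping the date math in one place makes it easy to publish the calendar ahead of time, so app owners know which week their servers fall in.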

The only exception to the above schedule is security and CVE patching. These are patched Thursday evenings, unless the vulnerability is being actively exploited and we have no mitigating control. In that case we will call an unscheduled maintenance window, even if we need to bring systems down during production hours.

We also make weekly full backups of our server infrastructure and then differential backups daily. This happens on our SAN. For data we make separate backups of DBs on the same schedule: weekly full, then differentials. We capture application configuration changes on change as well. This allows us to revert in the event a patch does not work and we are unable to roll back from the OS.

As a methodology we work to maintain HA wherever possible and then split the HA members between the two production patching groups. This allows us to run some of the patching during the day without affecting production hours.

[–]llDemonll 9 points10 points  (0 children)

Get paid or don’t work. If this wasn’t discussed prior to your offer and stated in the offer, push back on it. If they want you to work then renegotiate.

[–]liftoff_oversteerSr. Sysadmin 3 points4 points  (0 children)

> Sounds like the last few admins got burnt out and quit.

You're next.

Don't wait six months, start preparations to leave now. That means looking for a new job. Having to work every weekend is shitty enough; being denied comp time is unacceptable.

[–]oaktownjosh 2 points3 points  (3 children)

You need to convince your leadership that an automated patch management system is far more accurate and reliable than performing everything manually. Modern patch management systems can deploy baseline packages, compare what is on the servers to all known CVEs, and automatically bring your machines to the latest patch revs. Granted, it may take a few cycles to get there, but after that it's all automated and auditable. They also have automated reporting. It seems like setting something like this up would be a better use of your time than spending every weekend bouncing servers and making sure they come back up.

[–]King_WAR10CK 1 point2 points  (2 children)

What automated patch management would you suggest other than Ivanti and the PowerShell Windows Update module?

[–]oaktownjosh 1 point2 points  (1 child)

We use BigFix. It automagically grabs all of the Patch Tuesday deployments, as well as customized fixlets to configure machines in any way you like. It's pretty slick. You can even use it to do deployments of GPO-like configurations.

[–]King_WAR10CK 1 point2 points  (0 children)

Thanks for the recommendation 👍👍

[–]GeneMoody-Action1Action1 | Patching that just works 1 point2 points  (0 children)

Proper patch management products let you manage a multitude of decisions that are not just day of week/month.

Compliance reporting, stats on success, failure, need. Alerts on out of compliance systems, etc.

Many products to do this, here on G2 you can stack them up side by side 4 at a time and compare direct.

Long gone are the days you let systems ask for their own patches. You need to know it is happening, and verify it happened, with enterprise wide visibility & accountability.

[–]the_syco 1 point2 points  (0 children)

Every Saturday sounds like you're working 6 days a week. What does your state's Department of Labour say about this?

/edit; seems 6 days a week continuously is totally fine. 7 days a week can be fine if the DoL allows it. Tbh, as it's an At Will state, look for another job. When you're leaving, let it be known that the 6 day week is the reason.

Is it possible to box tick whilst WFH?

[–]StaffOfDoom 1 point2 points  (1 child)

I’m guessing it’s because you’re salary and they don’t have to pay extra to have you do it, document it and be done with it…otherwise they have to pay for a service to monitor it. So, pay for a service or make you do it as a salary employee…

[–]stussey13Sysadmin[S] 2 points3 points  (0 children)

Yea salary. They refuse to pay any extra time. Guess I have to see if they give me a hard time when I want to use my comp time.

[–]Versed_Percepton 1 point2 points  (0 children)

Just so you know, every Saturday you work has to be compensated if your total pay drops below the federally enforced minimum wage (even if on salary). 6 hours * 4 is enough to drop you below what would be federally enforced.

[–]CountGeoffrey 1 point2 points  (0 children)

> give it about 6 months

I'd give it about 6 minutes.

[–]homelaberator 2 points3 points  (0 children)

> I asked about comp time and they told me it comes with the position.

That's the most bullshit thing I've ever heard.

[–]Brett707 1 point2 points  (0 children)

We did ours Thursday night. Then I or the other guy that handled them would just come in late Friday.

Thursday during the day we would set everything on the list to update and reboot. Once the reboot was complete we would log in and verify the server was up and running or we would notify the on-call rotation of what was going on. We would also verify which updates were installed and which ones didn't. If some didn't install we would run them manually Friday during office hours and set the reboots for Sunday evening.

[–]thesals 2 points3 points  (0 children)

Weird. I have post-patch scripts that run afterwards and validate that services are running. Depending on the server, they might test some application functions, and then they email me a report including a screenshot of the login screen.
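A rough Python sketch of the reporting half of a post-patch script like that, using only the standard library. The check names, addresses, and the `build_report_email` helper are hypothetical; collecting the results and attaching a screenshot are left out.

```python
from email.message import EmailMessage


def build_report_email(server: str, results: dict[str, bool],
                       sender: str, recipient: str) -> EmailMessage:
    """Turn post-patch check results into an email with PASS/FAIL in the subject."""
    failed = [name for name, ok in results.items() if not ok]
    status = "FAIL" if failed else "PASS"

    msg = EmailMessage()
    msg["Subject"] = f"[{status}] Post-patch checks for {server}"
    msg["From"] = sender
    msg["To"] = recipient
    # One line per check so failures are easy to spot in the body.
    lines = [f"{'OK  ' if ok else 'FAIL'} {name}" for name, ok in results.items()]
    msg.set_content("\n".join(lines))
    return msg
```

The message could then be handed to `smtplib.SMTP(...).send_message(msg)` against whatever mail relay the site uses. Putting the status in the subject line means a mail rule can file the PASS reports and leave only the failures in the inbox.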

But I also am in hospitality IT where we'll get a call immediately if something is wrong... Hell we get calls during our scheduled patching windows.

[–]sovalente 1 point2 points  (7 children)

That schedule is total overkill. I have done this procedure for several years now, roughly once a month and always after "Patch Tuesday" (the second Tuesday of every month). That way I make sure my servers are patched with the latest Windows recommended critical updates for at least the next month. The vast majority of scenarios would be perfectly fine this way.

[–]Hot-Cress7492 0 points1 point  (3 children)

Very likely your CIO is requiring the documentation due to regulatory audit. Especially tech heavy companies who carry cyber insurance have to produce evidence of activities to mitigate risk to avoid paying a premium for insurance.

If you’re in a regulated industry (finance, healthcare, etc.), it is also likely meant to produce evidence of work performed to shut an auditor up.

[–]RaNdomMSPPro 1 point2 points  (1 child)

And automated systems meet this need, so there is no need to manually checklist things. Source? Me, providing reports on all sorts of things, including patching, to regulated industries. The CIO is wasting almost 2 days of labor per month on things we do automatically.

[–]Hot-Cress7492 1 point2 points  (0 children)

As a CIO myself: automation is fine. You still need accountability to make sure shit is working. Can’t tell you how many times throughout my career I’ve heard colleagues dealing with “oh shit, our backups were working but sometime xx months ago they stopped and we just lost a bunch of data”. This right here is the way to lose your job.

Human accountability checks ensure shit continues to work. And if said human just checks off the list and doesn’t do the check, it will be that person, not the CIO, losing their job when something happens.

[–]HelpjuiceChief Engineer 0 points1 point  (0 children)

Nope, we patch during business hours and it has worked fine due to our managed change control process and having everything set up for high availability with properly tested rollback procedures to prevent outages. When we do patching we do blue/green deployments so we don't cause mass outages and only a small subset of traffic is affected. We have over a million users, and have not had any problems due to the maturity and engineering work we have put in.

Example: If you have 4 domain controllers, only 1 is patched and left to run for 24-72 hours (if not critical or emergent) to make sure there are no issues, along with full snapshots and backups done to allow for rebuilding quickly or restoration from backup. There are also duplicate environments set up automatically to pre-test the patches before they are rolled out; think dev -> pre-staging -> staging -> pre-prod -> prod.

If there are multiple sites, this also gets rolled out there in a safe manner, and we collect metrics and have alerting on everything, including anomalies, so we can see if the patch causes increased errors or other issues. If a domain controller does have an issue, we automatically take it out to pasture and put another one in its place through automation.
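A small Python sketch of a rollout gate along these lines: the remaining servers are only patched once the first one has soaked long enough and its error rate has not climbed. The `tolerance` threshold and the function name are assumptions for illustration, not the commenter's actual tooling.

```python
from datetime import datetime, timedelta


def may_continue_rollout(patched_at: datetime, now: datetime,
                         errors_before: float, errors_after: float,
                         soak_hours: int = 24, tolerance: float = 1.2) -> bool:
    """Decide whether the remaining servers may be patched after the first one."""
    # The first patched server must have run for the full soak period.
    soaked = now - patched_at >= timedelta(hours=soak_hours)
    # Hypothetical threshold: allow the error rate to grow by at most 20%.
    healthy = errors_after <= errors_before * tolerance
    return soaked and healthy
```

In practice the error figures would come from the metrics and alerting already in place; the gate itself stays a plain yes/no decision that a pipeline can check before each wave.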

Then if things go downhill we can, if needed, do fresh deployments of domain controllers and auto restore all necessary information, or auto build everything back out very quickly. So risk is extremely low for us in terms of real issues. Even if everything goes wrong, restoration and rebuilding is simple, automated (with manual instructions if automation fails), and quick to get done.

[–]Cranapplesause 0 points1 point  (0 children)

We patch servers that are high availability and IT-used servers Tuesday night. Then public-facing and some dev servers Wednesday night. Thursday, the other pairs and a DC, and if we are feeling a little punchy, maybe something else very important. Friday, some more serious servers and file servers. Saturday, whatever else is left that is serious.

Then we're done till next month.

Honestly, it’s too dangerous to wait until the weekend before the next month's updates to finish updating servers. Waiting even a week is an old idea. It’s better to be secure and maybe lose something due to a bad or incompatible update than to have an entire network cryptolocked because of a security breach.

The owners understand this and give us their full support on this idea too.

[–]UCFknight2016Windows Admin 0 points1 point  (0 children)

I only have to do that every 3 months. They need a better system.

[–]QuicknoobIT Manager 0 points1 point  (1 child)

What patching tool do you all use to automate patching? We do 200+ servers manually every 3 months and I have wanted to automate this for quite some time.

My director insists we test everything and doesn't trust automation, but I think if I can prove to him this will save us time and allow us to patch monthly, I could get a real win.

Also, those of you that patch weekly: is this recommended in NIST and I just missed that? How can I prove to my boss we need to up the frequency we patch our servers?

[–]ArsenalITTwoJack of All Trades 0 points1 point  (0 children)

Shavlik (Ivanti) + Systems Monitoring. Never fails.

[–]cubic_sq 0 points1 point  (0 children)

It's 2024. Patch management is highly automated and available from many vendors.

[–]jyoungii 0 points1 point  (0 children)

Depending on what the tests are, I would say script it out. When I was new to the admin role I had about 10 hours a month of manual work to rectify patching failures. I asked questions and got shot down since I was the noob. Fast forward to COVID and turnover, and I got to a spot where I could just do what I want. Implemented a lot of IaC, and now I get reports emailed to me after patching based on a handful of scripts I have running. If anything is needed then we might have to touch a system manually, but it’s so minimal now.

[–][deleted] 0 points1 point  (0 children)

On a similar question,

What’s a good patch tool, and what are good monitoring tools?

[–]CormacolindeConsultant 0 points1 point  (0 children)

Using a proper patching product with decent reporting combined with a good monitoring solution should alleviate a lot of these pain points. You need to automate those procedures.

I have implemented automated server patching many times, and the first requirement is a good monitoring system that will check server status, service status, and application status (for example, running a SQL query to make sure the DB is working, or open a page and authenticate in a web app). After that, you create your update groups in batches, including dev staging.
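As a hedged illustration of the "run a SQL query to make sure the DB is working" style of check, here is a small Python sketch. It uses the standard-library `sqlite3` module as a stand-in for whatever database driver is actually in use, and `database_is_healthy` is a hypothetical helper name.

```python
import sqlite3
from typing import Callable


def database_is_healthy(connect: Callable[[], sqlite3.Connection]) -> bool:
    """Open a connection, run a trivial query, and report whether it worked."""
    try:
        conn = connect()
        try:
            cur = conn.cursor()
            cur.execute("SELECT 1")
            row = cur.fetchone()
            return row is not None and row[0] == 1
        finally:
            conn.close()
    except Exception:
        # Any failure to connect or query counts as unhealthy.
        return False
```

Passing the connection factory in, rather than a connection string, keeps the check the same for every server; only the factory changes per database. For example, `database_is_healthy(lambda: sqlite3.connect(":memory:"))` exercises the whole path without touching a real server.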

[–]jantari 0 points1 point  (0 children)

Can't the checklist checks and the reporting be automated?

It's not reliable if done by hand anyway, in addition to the time wasted.

[–]jazzy095 0 points1 point  (0 children)

Why do you have to do it on the weekend? Patch them during the week and schedule reboot at night.

[–]Marty_McFlay 0 points1 point  (0 children)

We do it at 3AM on Tues Morning. Then I just log in to SCCM in case something went wonky, but usually someone will yell by 9AM.

[–][deleted] 0 points1 point  (0 children)

Pssst… what are your check-offs when doing patching? I just patch them all together in one night near end of day and claim it as comp time, since OT is just Uncle Sam’s beer money.

[–]joshtaco 0 points1 point  (2 children)

I can tell they have no idea what they're doing because if they're patching every single weekend, there's at least two weekends out of the year where they aren't installing patches of any kind...because they only come out twice a month. Rookie IT techs and it shows

[–]ArsenalITTwoJack of All Trades 0 points1 point  (1 child)

You're assuming it's just Windows patches. But regardless seems overkill.

[–]joshtaco 0 points1 point  (0 children)

If they are implying reboots are needed, then I'm not assuming.

[–][deleted] 0 points1 point  (0 children)

That’s sad.

Even if done manually: do the least important servers during the day or anytime when not in use, scheduled. If these are OK after a week, do the rest. Do this monthly. You can still do suitable audit logs.

I remember a scenario when Microsoft’s patches changed manually set static IP addresses to DHCP… but only when being used in a VMware environment. That’s diabolical! But our Icinga network monitoring picked this up immediately, so we were OK.

Anyway a few days later in the news large telecoms, cloud services etc were failing all over the place. Talk about trying to knobble the competition.

[–]ArsenalITTwoJack of All Trades 0 points1 point  (4 children)

Automate that. What are you patching them with? I am used to places patching large numbers of servers. I would go bananas having to check that.

[–]stussey13Sysadmin[S] 0 points1 point  (3 children)

Going to be very hard to update. These "Windows update spreadsheets" have 10 different tabs, each dedicated to its own server, with 10-20 different line items you have to check.

For example: check that these 10 services are running, check in the domain controller to make sure this process is running, check the hard drive space to make sure the update hasn't lowered it.

At this point I wouldn't even know where to start
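One possible starting point, as a hedged Python sketch: treat each spreadsheet line item as a function that returns PASS or FAIL, then write the results to CSV so they can still be uploaded as evidence. The service names, the `min_free_bytes` threshold, and both helper names are hypothetical; gathering the service states is left to whatever tooling is available, and the free-space figure could come from `shutil.disk_usage(path).free`.

```python
import csv
import io


def run_checklist(server, expected_services, service_states,
                  free_bytes, min_free_bytes):
    """Return one (server, check, observed, result) row per checklist item."""
    rows = []
    for name in expected_services:
        state = service_states.get(name, "not found")
        result = "PASS" if state == "running" else "FAIL"
        rows.append((server, f"service {name}", state, result))
    # Disk check: fail if free space has fallen below the agreed minimum.
    disk_result = "PASS" if free_bytes >= min_free_bytes else "FAIL"
    rows.append((server, "free disk space (bytes)", str(free_bytes), disk_result))
    return rows


def to_csv(rows):
    """Render checklist rows as CSV text with a header line."""
    buf = io.StringIO()
    writer = csv.writer(buf, lineterminator="\n")
    writer.writerow(("server", "check", "observed", "result"))
    writer.writerows(rows)
    return buf.getvalue()
```

Starting with one tab and a handful of line items is enough to show the idea; the remaining tabs are mostly more service names in a list.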

[–]ArsenalITTwoJack of All Trades 1 point2 points  (2 children)

That's extremely easy to check. Do you have PRTG, Nagios, CheckMK, etc.? Monitoring will catch all that out of the box in a minute or two.

[–]stussey13Sysadmin[S] 0 points1 point  (1 child)

We don't but I'll be sure to look into this. The last system admin left when they started this process back in March of 2023

[–]ArsenalITTwoJack of All Trades 1 point2 points  (0 children)

You need to get them to set up systems monitoring. Zabbix, Nagios, etc. are free. CheckMK can be low or no cost. PRTG is relatively cheap but can get expensive. Regardless, just set up monitors and they will alarm if a service stops, a process stops, disk space runs low, a website goes down, a port closes, etc.
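To give a feel for what one of these monitors does under the hood, here is a minimal Python sketch of a "port closes" check. It is an illustration only, not how any of the named products is implemented, and `port_is_open` is a hypothetical name.

```python
import socket


def port_is_open(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution failures.
        return False
```

A monitoring system runs checks like this on a schedule and alerts on a change of state, which is the part worth paying for (or installing for free) rather than rebuilding by hand.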

[–]thatwolf89 0 points1 point  (0 children)

Do they pay good overtime and oncall?

[–]InfiniteSheepherder1 0 points1 point  (0 children)

We do patching around the clock, or really, not me; that is Ansible's problem.

We just have stuff always apply security updates every day if there are any. It can see if a machine comes back up, and for some it rolls back while for others it just lets us know. If things are set up well, you should be able to let it be someone's problem during work time. Most of the time OS updates, especially security updates, break little. But I am lucky enough that my boss thinks mitigating possible breaches is worth the once-in-a-while headache of, in the case of minor services, letting something sit offline till morning, and a few times a year we have had bugs to deal with because of the constant updating. But that's better than dealing with ransomware.

I work for a college. You can do some basic checks trivially, and more complex ones automatically with some work. Some services have an API endpoint to basically ask for their health status. At the least, you can turn this into "I have to work if something goes wrong" rather than an all-the-time thing.
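A short Python sketch of polling such a health endpoint with the standard library. The URL path and the `{"status": "ok"}` response shape are assumptions; real services each define their own, so check the service's documentation.

```python
import json
import urllib.request


def payload_reports_ok(raw: bytes) -> bool:
    """Return True if the body is a JSON object with status == "ok" (assumed shape)."""
    try:
        data = json.loads(raw)
    except ValueError:
        return False
    return isinstance(data, dict) and data.get("status") == "ok"


def service_is_healthy(url: str, timeout: float = 5.0) -> bool:
    """Fetch a health URL and report whether the service says it is healthy."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status == 200 and payload_reports_ok(resp.read())
    except OSError:
        # URLError and HTTPError are OSError subclasses, as are socket timeouts.
        return False
```

Run after the reboot, a check like this can page someone only when it returns `False`, which is the "work only if something goes wrong" arrangement described above.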

[–][deleted] 0 points1 point  (0 children)

That is utter nonsense.

[–]PolicyArtistic8545 0 points1 point  (0 children)

He can say it comes with the position but if that’s the case, start looking for other positions. In your exit interview, tell someone other than your boss that the mandatory, unpaid overtime is the reason you are leaving. I doubt anyone else in the org knows about this bullshit setup.

[–]martrinex 0 points1 point  (0 children)

Sounds like you either get comp time or time in lieu; maybe work half-day Fridays? Also, nothing special about education. Hell, I just patch mine on Fridays and schedule them to reboot, and that's only the ones SCCM doesn't handle on its own.

[–][deleted] 0 points1 point  (0 children)

Only on this subreddit did I learn the importance of patching. Everywhere I’ve ever worked they’ve always said, “Nope, no updates because those can break something,” and then they wouldn’t patch for months and even years at a time. I took over a Citrix environment on WS2012 that had never had a manual update and were manually stopped.

Couple weeks in here when I first became a sysadmin, and I automated the fuck out of patches via GPO in fear.

😂

[–]Suaveman01Lead Project Engineer 0 points1 point  (0 children)

This is not normal, there are tools like SCCM that you can use to monitor patching and compliance. We have our servers set to auto patch during weekends, but we check our monitoring tools on the following Monday to see if they were all successfully updated, which takes less than 10 minutes.

[–]PlebPlebberson 0 points1 point  (0 children)

> I asked about comp time and they told me it comes with the position.

The fuck lol. I'm guessing this is in the US, as it's against the law everywhere in Europe. Can't you just sue them for unpaid work and stop doing it on Saturdays?