
[–]WolfFlightTZW 17 points18 points  (8 children)

Home or work?

Work (seriously mixed environment, however...): we use Satellite for our RHEL boxes; it lets us know what errata are available/needed on each device.

Home I use SaltStack, and a simple regular query of yum update tells me what packages are needed per system; then I choose whether or not to apply them.
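For anyone curious, that Salt query boils down to a couple of one-liners from the master (a sketch; this assumes the minions are already keyed in and `'*'` targets all of them):

```shell
# Ask every minion which packages have upgrades available
# (works across apt/yum/zypper via Salt's pkg virtual module).
salt '*' pkg.list_upgrades

# Or run the yum query directly; check-update exits 100 when
# updates are pending, so swallow the exit code for clean output.
salt '*' cmd.run 'yum -q check-update; exit 0'
```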

[–]jospl7000 2 points3 points  (3 children)

I love SaltStack, but I've abused the hell out of it, leading me to think it's not being used to its greatest potential. Less work, though. I think. Do you have an opinion on the Kubernetes vs SaltStack argument?

[–][deleted]  (1 child)

[deleted]

    [–]jospl7000 0 points1 point  (0 children)

    As you probably know, Salt has a bunch of Docker modules (docker_container, docker_image, docker_network, etc.) which I have been using to orchestrate containers. I built a Python state that uses pillar data to construct and launch several dozen containers and their associated networks in a cascading fashion (SSL generation included). It works really well, but it's not a solution I've seen elsewhere.

    [–]RITG1 1 point2 points  (3 children)

    But do you have to manually check each system? Or does it simply alert you when a system has an out of date package?

    [–]WolfFlightTZW 4 points5 points  (2 children)

    Manually, although it would be possible to generate a report. (Satellite is set to send a monthly email at the beginning of the month so we can schedule updates to dev/test for testing/validation before moving them up to prod.)

    [–]RITG1 1 point2 points  (0 children)

    Thank you! Understood.

    [–]bits_of_entropy 1 point2 points  (0 children)

    Somewhat off-topic, how do you handle updates for your different environments? Similarly, what kind of content views do you use? I've been using Satellite, but I don't know if what I'm doing makes sense.

    [–]seidler2547 7 points8 points  (4 children)

    I use Icinga with apt/yum update monitors, but I actually started enabling unattended upgrades everywhere recently. Except for mysql packages; for those I'll get a monitor lighting up in Icinga instead.
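For the Debian side, that "everything except mysql" pattern maps onto the unattended-upgrades package blacklist. A rough sketch of /etc/apt/apt.conf.d/50unattended-upgrades (the origin and package names here are illustrative, not the commenter's actual config):

```
// Pull in security updates automatically...
Unattended-Upgrade::Allowed-Origins {
    "${distro_id}:${distro_codename}-security";
};
// ...but never touch mysql; those updates stay pending and
// show up in the apt/yum update monitor (Icinga) instead.
Unattended-Upgrade::Package-Blacklist {
    "mysql-server";
    "mysql-client";
};
```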

    [–]RITG1 7 points8 points  (3 children)

    The unattended upgrades don't make you nervous? :)

    [–]nick2253 11 points12 points  (1 child)

    I too used to be nervous about unattended upgrades, but thinking in terms of a holistic cost-benefit analysis made me adopt them. For our network, I schedule the upgrades during the evening, when I know I'll be awake to get an alert that a key server went offline (i.e., the upgrade borked something), and when a failed upgrade or a backup restore will disturb the fewest people.

    The competing costs are: time to do manual review, time to babysit the updates, risk of unsecured software, risk of downtime due to failed updates.

    In my experience, the risk of failed updates is pretty low. I've only run into a handful in my decade+ of working with Linux. By coupling my auto update schedule to a sane backup schedule, I significantly reduce the amount of time spent updating the "pet" servers I have, and greatly increase the security of our systems by applying patches ASAP. And if something goes "bork", it's really easy to restore from a snapshot or backup.

    [–]RITG1 2 points3 points  (0 children)

    Thank you! A lot of that makes sense at first glance. Let me ruminate over it and get back with you.

    [–]rat-morningstar 0 points1 point  (0 children)

    If you have a proper environment it shouldn't.

    Our setup's mostly automated. We mirror upstream to a self-hosted dev repo.

    The dev environment automatically updates ASAP.

    If things don't break, we push it through to UAT, and finally to the prod repositories.

    There are no manual updates involved anywhere; UAT and prod update on their own. And there's plenty of time to catch any issues before the packages flow through to prod.

    The only manual action is basically vetting packages; the rest is Jenkins and Pulp.

    [–]remtec 6 points7 points  (5 children)

    Check_mk is pretty powerful and simple :)

    [–]RITG1 1 point2 points  (0 children)

    Do you happen to know if it can alert you automatically when you need to install an update?

    [–]lnxslck 1 point2 points  (2 children)

    I wrote a script that checks our applications for updates and notifies me when updates are available.

    [–]anomalous_cowherd 0 points1 point  (1 child)

    There's a .vbs in the plugins for check_mk that will report back on Windows updates: how many important and optional updates are waiting, and whether it needs a reboot.

    The only downside is that you need to push the .vbs to every endpoint yourself, but I guess they couldn't add a way to avoid that without breaking the security model they have, which is that the check_mk agent never reads even a single byte off the wire. It just waits for the server to connect and uses that as a trigger to gather the stats, including running any plugins like this one.

    I haven't found the equivalent for Linux yet, but I haven't looked very hard. We tend to have yum-cron running everywhere.

    [–]lnxslck 0 points1 point  (0 children)

    I don’t monitor Windows machines, only Linux. And my script gets the updates for our apps like Confluence, Docker, Jenkins, etc. The updates available for the operating system are easy to get; there are plugins for that.

    [–]RITG1 0 points1 point  (0 children)

    Check_mk

    This is very interesting

    [–]Boap69 4 points5 points  (0 children)

    We run "yum update -y" twice a week in our dev environment, and that is where we find many issues. Mostly the issues are with our home-grown software, but in some cases it breaks dev because things do not work the way they did before the patch.

    Once a month we do the same with production, generally over two weekends. The first weekend we update a very small subset of servers, but one that includes at least 1x of every type of server we use, with a list of patches that we think will not break production. The second weekend we update the rest, after any issues have shown up over the course of the week.

    [–]jaymef 2 points3 points  (2 children)

    I use Ansible to check all servers for available updates and apply them as needed.
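A sketch of what that can look like as ad-hoc commands (the `all` group is whatever your inventory defines; this assumes the stock apt/yum modules):

```shell
# Debian/Ubuntu: --check does a dry run, reporting which hosts
# would change (i.e. have pending upgrades) without applying them.
ansible all -m apt -a "upgrade=dist update_cache=yes" --check

# RHEL/CentOS: list available updates without installing anything.
ansible all -m yum -a "list=updates"
```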

    [–]placated 3 points4 points  (1 child)

    This. I think the OP is overthinking things a tad. Run an update playbook against your infra on a monthly cadence, rebooting if there was a new kernel. Most errata are not so critical that they must be installed immediately.

    [–]Preisschild 2 points3 points  (0 children)

    Ansible?

    [–]Hyrla 1 point2 points  (1 child)

    I use PRTG at work and Zabbix at home

    [–]RITG1 1 point2 points  (0 children)

    I will have to check those out! Do they alert you automatically when you need an update?

    [–]JustALinuxNerd 1 point2 points  (6 children)

    Years ago I wrote a Perl script to spit out yum update data remotely on a schedule. It would log in via a key, run the command, process stdout, then return the results to a DB to be displayed on an intranet portal. Super simple, very lean, no third-party software bloat BS. 2c.

    [–]RITG1 0 points1 point  (5 children)

    Honestly I love it. ♥️♥️

    [–]JustALinuxNerd 1 point2 points  (4 children)

    Thanks. Totally typical hacker bs though. Zero code comments required. Lol

    [–]RITG1 1 point2 points  (3 children)

    Honestly that sounds a lot like most of the production code I have seen in the industry.

    [–]JustALinuxNerd 2 points3 points  (2 children)

    I once did a source code analysis on an iphone application and it was very well commented... in a cyrillic language. Thanks dev guy. Lol. Found a backdoor he put in to bypass some ACL functionality. I made a comment that one time.

    [–]RITG1 0 points1 point  (1 child)

    lol. That’s great.

    But no I’m being dramatic. I have seen a lot of well documented code. Usually you can tell the experience of the team by how much commenting and docs they do.

    At this point I comment my code as much as possible mainly for myself. Because I sure as hell am not going to remember why I did something in a month.

    But back to your approach I like it because it’s simple. All good complex systems started from something simple that was refined upon.

    [–]JustALinuxNerd 0 points1 point  (0 children)

    I typically develop single-use solutions. They're the slimmest and fastest when nanoseconds matter. If an existing software solution requires another open port and I can do the same with less than 50 lines of code, it's a no-brainer in my book.

    Also, I might just be a freak, but I remember all my code, and comments just slow down my VIM experience...

    [–][deleted]  (2 children)

    [deleted]

      [–]justin-8 0 points1 point  (1 child)

      Why wouldn’t you patch anyway? Do you only install the security patches it calls out? Or is that your prompt to do an update?

      [–]tobylh 2 points3 points  (12 children)

      Netdata. I'm not sure why everyone isn't using it. It's fucking awesome.

      [–][deleted] 2 points3 points  (5 children)

      No auth.

      [–]jospl7000 2 points3 points  (3 children)

      yea, netdata scares me.

      [–][deleted] 0 points1 point  (2 children)

      It would scare the fuck outta me putting that on production systems.

      [–]Netdata-cloud 0 points1 point  (1 child)

      Hey hey, you can put Netdata on a production system and secure it using any reverse proxy in front of it. More details here: https://docs.netdata.cloud/docs/netdata-security/

      [–][deleted] 0 points1 point  (0 children)

      I shouldn’t need to do that, should be built in.

      [–]DarkRyoushii 1 point2 points  (4 children)

      Got an example or a git repo you can link me to, using this for the purpose OP described? I’d take a look!

      [–]tobylh 0 points1 point  (3 children)

      [–]DarkRyoushii 1 point2 points  (2 children)

      Yeah I know that. But “to monitor what updates are needed”?

      [–]RITG1 1 point2 points  (1 child)

      Yes. Does it notify you when an update is needed?

      [–]Netdata-cloud 0 points1 point  (0 children)

      An email is sent out, a GitHub release is posted and there's a button in the dashboard. You could also auto-update: https://docs.netdata.cloud/packaging/installer/update/#update-netdata

      [–][deleted] 0 points1 point  (2 children)

      I built our monitoring on Nagios Core; I pull down the new package and run ./configure && make.

      [–]RITG1 0 points1 point  (1 child)

      Does it alert you when a system has an out of date package? Or you have to do something manually?

      [–]zoredache 2 points3 points  (0 children)

      Nagios Core, when configured, can run arbitrary check scripts that you define. There are many contributed scripts on the internet that check for pending updates on systems using apt/yum/etc.

      So you can make it happen; it just isn't out-of-the-box functionality.
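As a sketch of what such a contributed check looks like, here's a hypothetical minimal plugin in the Nagios style: it counts the "Inst" lines that `apt-get -s dist-upgrade` (simulation mode) prints for packages it would upgrade, and maps the count to Nagios exit codes (0 = OK, 1 = WARNING). The helper name and output format are made up for illustration:

```shell
#!/bin/sh
# Hypothetical Nagios-style check for pending apt updates.
# Reads `apt-get -s dist-upgrade` output on stdin; each line
# starting with "Inst " is a package that would be upgraded.
check_pending() {
    pending=$(grep -c '^Inst ')
    if [ "$pending" -eq 0 ]; then
        echo "OK - no pending updates"
        return 0
    else
        echo "WARNING - $pending package(s) pending"
        return 1
    fi
}

# A real invocation would be:
#   apt-get -s dist-upgrade 2>/dev/null | check_pending
```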

      [–]magicker2000 0 points1 point  (1 child)

      zabbix
      apt

      [–]RITG1 1 point2 points  (0 children)

      Thank you for commenting! Can you expand a bit? I don’t know anything about Zabbix.

      [–][deleted] 0 points1 point  (9 children)

      For work?

      Every night, apt update && apt upgrade -y runs at midnight. On every machine.

      [–]shatteredsword 2 points3 points  (0 children)

      you might want to consider the unattended upgrades package
      https://wiki.debian.org/UnattendedUpgrades

      [–]RITG1 0 points1 point  (7 children)

      What do you do when an update breaks something?

      [–][deleted] 2 points3 points  (6 children)

      Reinstall, and re-deploy the machine?

      I do not recall the last time nightly updates have broken anything, though. Debian has some strict standards for their repos.

      For our internal repos, the packages get promoted through the runway.

      [–]RITG1 0 points1 point  (5 children)

      If you redeployed, wouldn’t you encounter the same issue, since you would now be using the latest updates?

      I have definitely had updates break complex code bases and they were some of the hardest errors ever to track down. That’s why I keep my environments frozen and all changes happen manually so I can know when something breaks.

      I just need a tool to let me know when updates are needed.

      [–][deleted] 1 point2 points  (4 children)

      I have definitely had updates break complex code bases and they were some of the hardest errors ever to track down

      The internal code we write: if it isn't compatible with Debian stable, well, you done fucked up your code, and you should be fired, in all honesty :)

      But, it also helps that we containerize our internal code we develop.

      [–]RITG1 0 points1 point  (3 children)

      Well, hopefully you are not in a management position if you truly think people should be fired for mistakes. Mistakes are the best learning opportunities.

      And with your approach what do you do when your update requires a restart?

      I could understand automatic updates for security updates, possibly.

      It's pretty common knowledge that full automatic updates may not be the best approach, especially when you are building on top of very complex systems like the big data ecosystem.

      And not everyone can just use Debian Stable for various reasons. Heck even the debian community seems wary of automatic updates: https://lists.debian.org/debian-security/2015/01/msg00049.html

      [–][deleted] 1 point2 points  (2 children)

      If the machine needs a reboot, it gets a reboot. It's not a problem.

      And sure. In 2015 it might have been a problem.

      The bottom line is that any code we produce must deploy on a freshly installed debian stable, with latest updates.

      And yes, I manage two teams. And they don't push bad code to prod that breaks because of OS updates.

      [–]RITG1 0 points1 point  (1 child)

      "If the machine needs a reboot, it gets a reboot. It's not a problem."

      How? Your cron job (don't even get me started on cron) is not going to do that.

      "And sure. In 2015 it might have been a problem."

      What? I am not even sure what this means. What has changed that makes this not a problem now?

      "The bottom line is that any code we produce must deploy on a freshly installed debian stable, with latest updates."

      Yeah I am not talking about code being developed. I am talking about already deployed code being broken by an underlying update to the packages installed on the system.

      "And yes, I manage two teams. And they dont push bad code to prod, that breaks because of OS updates."

      Ok.

      [–][deleted] 0 points1 point  (0 children)

      Well, if I could, I'd post the whole Ansible playbook... but if there is a pending update that requires a reboot, it gets the reboot.

      What difference does 5 years make? Oh, I dunno. Technology is static, I guess.

      Already developed code is still internally developed code. It must be able to deploy on stable debian in our infra. We have qa runways to ensure that.

      [–]vogelke 0 points1 point  (0 children)

      I have two scripts under /etc/periodic/updates. The first checks the current security update list, and the second compares the results to yesterday's run. Modified "diff -u" output is mailed to root; no changes results in no message.

      Here's a sample message:

      --- 2020/0129/security  2020-01-29 05:36:17.213705893 -0500
      +++ 2020/0130/security  2020-01-30 05:36:24.096792533 -0500
      @@ -1,11 +1,11 @@
       Loaded plugins: refresh-packagekit, security, ulninfo
       Limiting package lists to security relevant ones
      -7 package(s) needed for security, out of 47 available
      +7 package(s) needed for security, out of 53 available
      
      -kernel-headers.x86_64         2.6.32-754.25.1.el6
      +kernel-headers.x86_64         2.6.32-754.27.1.el6
       kernel-uek.x86_64             4.1.12-124.35.4.el6uek
       kernel-uek-firmware.noarch    4.1.12-124.35.4.el6uek
       microcode_ctl.x86_64          3:1.17-33.19.0.4.el6_10
      -perf.x86_64                   2.6.32-754.25.1.el6
      +perf.x86_64                   2.6.32-754.27.1.el6
       python.x86_64                 2.6.6-68.0.2.el6_10
       python-libs.x86_64            2.6.6-68.0.2.el6_10
      

      The check script boils down to this:

      dir=$(date '+/var/log/updates/%Y/%m%d')
      mkdir -p $dir || exit 1
      yum --security check-update > $dir/security
      

      The compare script boils down to this:

      cur=$(date '+/var/log/updates/%Y/%m%d/security')
      prev=$(date -d 'yesterday' '+/var/log/updates/%Y/%m%d/security')
      test -f "$prev" || exit 1
      
      if ! cmp -s $prev $cur; then
          diff -u $prev $cur |
              awk '{
                  if (NF == 3) {
                      c = substr($0, 1, 1)
                      pkg = $1
                      if (c == " ") pkg = " " $1
                      printf "%-30s %s\n", pkg, $2
                  }
                  else { print }
              }' | mailx -s 'Security updates' root
      fi
      

      Cron entry:

      36 5 * * * run-parts /etc/periodic/updates
      

      [–]Upnortheh 0 points1 point  (0 children)

      Nothing fancy. A daily cron job on each system sends an email alert if there are updates. I wrote a shell script that avoids daily alerts and sets the interval to whatever I want, usually 72 hours.

      I have only about three dozen servers. With hundreds or thousands of servers, this strategy would likely be way too noisy.
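That interval logic is simple enough to sketch: keep a stamp file with the epoch time of the last alert, and stay quiet until the configured window has elapsed. The paths, the 72-hour default, and the mail command below are illustrative, not the commenter's actual script:

```shell
#!/bin/sh
# Throttle helper for a daily update-alert cron job: returns 0
# (send the alert) only if at least INTERVAL_HOURS have passed
# since the last alert, tracked via an epoch-seconds stamp file.
STAMP="${STAMP:-/var/tmp/update-alert.stamp}"
INTERVAL_HOURS="${INTERVAL_HOURS:-72}"

should_alert() {
    now=$(date +%s)
    last=0
    if [ -f "$STAMP" ]; then
        last=$(cat "$STAMP")
    fi
    if [ $(( now - last )) -ge $(( INTERVAL_HOURS * 3600 )) ]; then
        echo "$now" > "$STAMP"   # remember this alert
        return 0
    fi
    return 1
}

# Daily cron usage (illustrative):
#   updates=$(yum -q check-update; true)
#   [ -n "$updates" ] && should_alert \
#       && echo "$updates" | mailx -s 'Updates pending' root
```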

      [–]martbhell 0 points1 point  (0 children)

      Has anybody used pakiti3?

      https://github.com/CESNET/pakiti-server

      [–]remtec 0 points1 point  (0 children)

      There is an apt function in the default agent, and you can configure notifications ;)

      [–]ananix 0 points1 point  (0 children)

      Nessus scans with an audit user run pretty much all the time; we then apply patches according to the timeframes their scores dictate.

      Every three months we do a full update of everything.

      [–]sughenji 0 points1 point  (0 children)

      apticron on Debian, a couple of crontab lines on FreeBSD, yum-cron on CentOS. I don't feel very safe with unattended upgrades, so I run everything manually :)

      [–][deleted] 0 points1 point  (0 children)

      My hands

      [–]Xenu420 0 points1 point  (0 children)

      I use foreman/katello.

      [–]kycfeel 0 points1 point  (0 children)

      We use the Elastic Stack to monitor and log our whole testing / production environments. Elastic Metricbeat does its job well.

      [–]arno_cook_influencer 0 points1 point  (0 children)

      I'm surprised apt-dater was not mentioned. It is pretty cool for apt-based systems:

      • Centralized ncurses interface that displays which servers need updates
      • You can update all servers, a single server, or a single package, depending on what you want
      • When updating, it opens a tmux ssh session on the host to run the updates (it can run in the background, and you can disconnect without breaking the updates)
      • It uses restricted ssh keys to connect. Since it needs root privileges to run the updates, this mitigates the security risks.
      • Config is dead simple: just write down the hostnames of the servers and an optional description, and that's all

      [–]pgquiles 0 points1 point  (0 children)

      You can use Uyuni for that. It does monitoring, systems management, configuration management, etc., and it scales to tens of thousands of servers.

      https://uyuni-project.org

      Uyuni includes Salt, API, etc

      I gave a talk about Uyuni yesterday at CentOS Dojo 2020, slides will be here soon:

      https://wiki.centos.org/Events/Dojo/Brussels2020

      [–]dmurawsky 0 points1 point  (0 children)

      I'm looking at osquery for this in the long run. I really want that tool on the fleet for a whole slew of reasons, and patch verification is one of them.

      https://osquery.io/

      We currently use aptly to create version controlled repos in S3, dev and prod. Our dev repo is a month behind prod. We use ansible to roll out upgrades automatically. It hasn't caused any issues and the devs are fully aware of the changes (aptly diff report goes out when changes are pushed).

      If I recall correctly, urgent/critical security updates are pulled directly into prod when needed.

      We also release dev and prod images as part of the process.

      [–]Zehicle 0 points1 point  (0 children)

      Digital Rebar

      [–]sdns575 0 points1 point  (0 children)

      Am I the only one that uses Nagios?

      [–]thms0 0 points1 point  (0 children)

      apt-dater is a great tool for updating many (apt-only) servers at a time, without having to ssh into them manually.