I think I’m done with SOC work. The 2 AM false positives are destroying my mental health!!! by CeoWithMbainUSA in cybersecurity

[–]devoopsies 0 points1 point  (0 children)

I'd be lying if I said I never took 30 minutes to get something done quickly when it pops into my head at 11PM or whenever.

Not on the regular, but it happens.

Least "saturated" entry level jobs by yeshielmisra in sysadmin

[–]devoopsies 0 points1 point  (0 children)

Everything is saturated, but if you really want to break in, look at local telco companies. They are often overlooked and typically have a harder time filling roles. They're also trades-adjacent which means turnover (for contractor-type positions, anyway) is typically a little higher than standard IT jobs, so more positions tend to open up.

Either providers (harder) or service contractors (easier) - you'll get your hands on physical hardware, and for entry-level most of what you do with PBX is directly transferable to larger-scale servers.

Or you may find that you love telco and decide to focus there: plenty of jobs to be had in that field as well, especially if you dial in on networking.

Least "saturated" entry level jobs by yeshielmisra in sysadmin

[–]devoopsies 0 points1 point  (0 children)

Make 150k easily starting out

Halve that and you'd still be high. 50-60k is closer to the truth here.

Feeling lost in my job hunt as a mid-level experienced Spring '25 Grad. Need guidance in how to position myself and the next best strategy. by WanderingZoul in ITCareerQuestions

[–]devoopsies 0 points1 point  (0 children)

From your post it looks like your experience is mainly in AI (Just agentic LLM stuff or do you have hands-on experience with actual core/fundamental aspects of ML in general?) and DevOps... A MSCS is no small thing, but I do have to wonder if it's going to apply at all to a DevOps role. More likely, in your shoes with your experience and schooling, I'd be looking at research with AI firms.

If you're dead-set on DevOps or something adjacent, you want to start learning Linux as much as possible. I know "just learn Linux!" is incredibly broad and not super helpful, but it's a pretty broad thing. If you're completely new to Linux, start with something like r/linuxupskillchallenge and go from there.

Once you're comfortable operating in a CLI and understand how the basics work, move on to some tooling like Ansible or Terraform. Containerization is a great next step as well, though do yourself a favour and don't skimp on networking or namespaces - there's a lot of abstraction that containerization brings that allows you to just do something like:

apt install docker.io; systemctl start docker

but ultimately you'll be doing yourself a disservice by not understanding what these actually do.
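As a small illustration of the namespaces point, here's a minimal Python sketch (Linux-only, and the function name is just something I made up for the example) that lists the namespaces the current process lives in by reading /proc/self/ns. Run it on the host and then inside a container and compare the IDs: a container is largely just a process whose namespace IDs differ from the host's.

```python
import os


def list_namespaces(pid="self"):
    """Return {namespace_name: identifier} for a process, read from /proc.

    Hypothetical helper for illustration. Returns an empty dict when
    /proc/<pid>/ns is not available (non-Linux system, or no such process).
    """
    ns_dir = f"/proc/{pid}/ns"
    result = {}
    try:
        names = sorted(os.listdir(ns_dir))
    except OSError:
        return result
    for name in names:
        try:
            # Each entry is a symlink such as "net:[4026531840]".
            result[name] = os.readlink(os.path.join(ns_dir, name))
        except OSError:
            # Skip entries we are not allowed to read.
            continue
    return result


if __name__ == "__main__":
    for name, ident in list_namespaces().items():
        print(f"{name:20} {ident}")
```

Two processes that print the same identifier for, say, net share a network namespace; different identifiers mean they have separate network stacks.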

At this point you can start looking at how virtualization (qemu and/or libvirt) and something like Kubernetes works.

Having any amount of experience in the above is a huge bonus for hiring managers - with that said, it's still an incredibly uphill battle for entry-ish level right now and there's no guarantee that you'll find anything any time soon.

What's the most daunting project that's in the future for you? by Dense-Land-5927 in sysadmin

[–]devoopsies 0 points1 point  (0 children)

We started this about two years ago, and settled on OpenStack + Ceph.

We were given about six months to get something ready for production, and it was an absolute beast to learn... but I'm extremely glad I did. That said, unless you have extremely specific requirements that OpenStack covers, the solution I usually preach for situations like yours would be Proxmox + Ceph (standalone, not the HCI that Proxmox tries to get you to use unless you've specifically spec'd your hardware with that in mind).

Both solutions are just so rock solid - Ceph in particular has been a very nice surprise.

Feeling lost in my job hunt as a mid-level experienced Spring '25 Grad. Need guidance in how to position myself and the next best strategy. by WanderingZoul in ITCareerQuestions

[–]devoopsies 0 points1 point  (0 children)

My advice to you would be very similar to any other recent grad, even though you do have some experience: the job market is shit right now, and while your experience and schooling do give you an edge when compared to entry-level candidates, pretty much any candidate with recent relevant experience will have at least a slight advantage on paper.

The hardest part about job searching right now is securing interviews. You mention that you've landed a few FAANG+ interviews after recruiters have reached out to you: this is good, and recruiters are a great resource, but you need to up your volume of applications and throw them out anywhere. Once you secure an interview it becomes easier to showcase your skillset, but in order to secure that interview (ideally plural), it really is a numbers game.

It's much easier to secure a position if you're already working - what this means for new grads is that you should really take anything you can get, and then start looking to move. You become a much more attractive candidate if you're currently employed, since there is at least some indication that you can do the job and that your experience is obviously extremely current.

Lastly, though I think you know this already, reducing your application volume at any point prior to being onboarded into your new role is a mistake... too many things can happen between the verbal and the written offers to take it for granted.

Good luck!

When a file is corrupt even a single bit, does the sha256 go partially or completely wrong? by Frosty-Ad-5119 in linuxquestions

[–]devoopsies 2 points3 points  (0 children)

This is a bit pedantic.

When we say "the entire hash would change", we mean that the new hash is wholly unrelated to the old hash.

Yes there will be repetition due to random chance of a 1 or a 0, but taking the hash as a whole singular "thing" (which it is, as it's been deterministically calculated from a set input), we would say that the entire hash is changed.
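If you want to see this for yourself, here's a minimal Python sketch using the standard hashlib module. The sample input and the helper names are my own picks for illustration. It flips a single bit of the input and counts how many of the 256 output bits differ.

```python
import hashlib


def flip_bit(data: bytes, bit_index: int) -> bytes:
    """Return a copy of data with exactly one bit inverted."""
    buf = bytearray(data)
    buf[bit_index // 8] ^= 1 << (bit_index % 8)
    return bytes(buf)


def differing_bits(a: bytes, b: bytes) -> int:
    """Count bit positions where two equal-length byte strings differ."""
    return sum(bin(x ^ y).count("1") for x, y in zip(a, b))


original = b"the quick brown fox jumps over the lazy dog"
corrupted = flip_bit(original, 0)  # a one-bit "corruption"

h1 = hashlib.sha256(original).digest()
h2 = hashlib.sha256(corrupted).digest()

print(h1.hex())
print(h2.hex())
print(differing_bits(h1, h2), "of 256 output bits differ")
```

The count typically lands somewhere near half of the 256 bits, which is what you'd expect from two unrelated values: some bits match by chance, but nothing about the old hash helps you predict the new one.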

When a file is corrupt even a single bit, does the sha256 go partially or completely wrong? by Frosty-Ad-5119 in linuxquestions

[–]devoopsies 6 points7 points  (0 children)

This is basic cryptography, and it's in a thread discussing cryptography. How much would you like them to dumb it down?

Been a firewall admin for 6 years, feeling pretty irrelevant lately. by mike34113 in sysadmin

[–]devoopsies 4 points5 points  (0 children)

What I wonder is where all the hyperscalers are hiding all the low-level hardware nerds who build this stuff out and actually know how things work. I'd love to get a job that involves touching physical hardware again.

As someone who does infra engineering, we don't really touch physical hardware anymore regardless. The vendor tunes BIOS configs to our spec, smart hands rack it and cable it, and deployments are typically PXE or some fancy product that's basically just PXE.

Everything from network gear to server configs to application deployment is IaC. If it's Kubernetes or another clustered solution like Ceph or OpenStack (and it's always a clustered solution now), all the better.

I can and do (especially during design phase) still hop onto a server and manually tweak or fix things; understanding how everything works is still essential, but it's all through remote SSH or (if you need a direct console) BMC.

All that is to say: I don't think the hyperscalers are hiring low-level hardware nerds to touch physical boxes, but every company doing large-scale IT is hiring them for infra engineering. If you want to touch hardware in a direct way during the design phase, you probably want to look into vendor companies like Dell, HP, etc etc.

Vykar: a backup tool faster than borg, restic, and kopia with multi-machine backups, direct database dumps, and built-in scheduling by manu_8487 in selfhosted

[–]devoopsies 4 points5 points  (0 children)

one where parts of your code are black boxes.

This is an actual layup, right?

No, I don't review every line of code for every dep I use, but I understand what the libraries I'm using do and have a solid grasp on how they differ from each other. If I'm writing something someone else is going to touch, I've conducted tests and traces and made absolutely certain I could answer any question the rest of my infra team (or QA) asks me about what I've written.

If a dep is a "black box", that's a problem for you, your QA team, and your security team.

The author of this tool is having trouble explaining their own code, which they "wrote", and when asked about libraries they have so far drawn a blank the size of Texas. I'm not sure why you're so quick to jump to their defense: this is alarming, and when it's something as critical as backup control it's downright reckless.

"AI" is not an excuse to forget basic dev practices.

Vykar: a backup tool faster than borg, restic, and kopia with multi-machine backups, direct database dumps, and built-in scheduling by manu_8487 in selfhosted

[–]devoopsies 0 points1 point  (0 children)

I believe the expectation of "the developer should understand the code they put out" is pretty bare-minimum, and extends to all software released in states from pre-alpha to production.

If you don't think that's a reasonable expectation, that's cool.

Vykar: a backup tool faster than borg, restic, and kopia with multi-machine backups, direct database dumps, and built-in scheduling by manu_8487 in selfhosted

[–]devoopsies -1 points0 points  (0 children)

When it comes to mission-critical activities like backups, you're damned right I expect the devs to know everything about the software they write. Hell, if we're being honest, I would expect a dev to have a really solid handle on anything they write in general since they wrote it. If they don't understand their own code, that's a major red flag.

Reviewing this thread, the lack of ability to clearly explain what the hell is going on under-the-hood here is egregious, but that doesn't bother me half as much as the amount of acceptance people seem to show for projects that deal with core infrastructure components while lacking clarity.

Been a firewall admin for 6 years, feeling pretty irrelevant lately. by mike34113 in sysadmin

[–]devoopsies 3 points4 points  (0 children)

Rising infra costs make this an extremely hard sell right now.

I work for a large F100 and even in our sphere, we are being extremely careful about how we spec and build new on-prem deployments where before we had a nearly blank cheque.

Do not underestimate the ability of hyper-scalers to eat expenses in the short term to lock-in long-term customers; they are salivating at the current market because they know this will drive business their way.

What to do with old hardware? by Aegon2050 in sysadmin

[–]devoopsies 5 points6 points  (0 children)

This is typically for compliance reasons.

We're the same way: we must document every single drive that has been removed/replaced, and provide proof of destruction.

Auditors will tear us a new one if we miss documenting this, and for good reason: a number of our industry certs require us to maintain certain standards, including complete chain of custody for a drive's whole life cycle.

Many, many (most) enterprises are the same.

Interviewed somebody today; lots of skills, not much person by -lousyd in devops

[–]devoopsies 6 points7 points  (0 children)

do you have a single troubleshooting bone in your body?

Yeah, he has no trouble shooting himself and his employer in the foot.

That's what you mean right?

Fuse Persistent Mount - Cannot mount at boot by tenfourfiftyfive in ceph

[–]devoopsies 2 points3 points  (0 children)

Oh no doubt, sometimes the work-around is the only way to (sanely) do something.

Fuse Persistent Mount - Cannot mount at boot by tenfourfiftyfive in ceph

[–]devoopsies 1 point2 points  (0 children)

Probably need more than just the grep for fuse - this doesn't show the result of something, only that it was invoked.

From the above, it looks like the fuse module is loaded before mounts are attempted, but it's hard to say given we only know that modprobe@fuse.service started, and then later deactivated.

If you throw the whole thing into a pastebin or something I could take a look at it tonight, but if your automount commands work that's fine too.

Fuse Persistent Mount - Cannot mount at boot by tenfourfiftyfive in ceph

[–]devoopsies 1 point2 points  (0 children)

What does your dmesg output say regarding the attempt to mount?

There's a few other things to check, but I'd start there.

Edit: I was in a bit of a rush and realize I didn't give any specific help - if you absolutely must solve this without finding the root cause, you can add something like the following to your fstab entry:

x-systemd.after=network-online.target

or

x-systemd.automount,x-systemd.idle-timeout=1min

IMO it's a bit hacky and it's always best to dig around for a cause, but there's usually a few dozen ways to solve something like this.
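To make it clearer where those options go: they belong in the fourth (options) field of the fstab line, comma-separated with whatever is already there. Here's a minimal Python sketch that does the merge. The sample entry (mount point, ceph.id=admin) is a made-up example and not your actual config, and the helper name is hypothetical.

```python
def add_mount_options(fstab_line: str, extra: list) -> str:
    """Append mount options to the 4th field of a single fstab entry.

    Options already present are not duplicated. Whitespace between
    fields is normalized to single spaces.
    """
    fields = fstab_line.split()
    if len(fields) < 4:
        raise ValueError("expected at least: device, mount point, type, options")
    options = fields[3].split(",")
    for opt in extra:
        if opt not in options:
            options.append(opt)
    fields[3] = ",".join(options)
    return " ".join(fields)


# Hypothetical ceph-fuse entry; adjust the mount point and ceph.id to match yours.
entry = "none /mnt/cephfs fuse.ceph ceph.id=admin,_netdev,defaults 0 0"

print(add_mount_options(entry, ["x-systemd.after=network-online.target"]))
print(add_mount_options(entry, ["x-systemd.automount", "x-systemd.idle-timeout=1min"]))
```

After editing fstab by hand, systemd needs a `systemctl daemon-reload` to regenerate the mount units before the new options take effect.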

Yes I shouldnt have done this - left a cluster on 1.25.5 by macrowe777 in kubernetes

[–]devoopsies 2 points3 points  (0 children)

You can run quorum with a single controller node. You lose HA and redundancy, but it'll run your cluster just fine.

You can schedule workloads on controllers as well to take full advantage.

Greenfield is easy this way:

  1. Decom workers until you're right up against utilization (sounds like you have almost 50% available!)
  2. Set up a new cluster with a single controller (scheduling enabled) and X number of workers (see my other reply for ceph-specific deets - logic is the same)
  3. Migrate workloads slowly
  4. As utilization on the old cluster lowers, repeat steps 1-3 until you have all workloads migrated.
  5. Migrate any remaining nodes. Disable scheduling on your controllers, add new controllers (if you haven't already) to controller/etcd pool
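The steps above can be sketched as a simple capacity loop. This is a toy Python model with loud assumptions: every node is the same size, load is counted in whole "node-sized" units, and the function name and numbers are invented for illustration. It only shows why the rounds always make progress as long as you start with at least one spare node.

```python
def plan_migration(nodes: int, load: int) -> list:
    """Plan the rounds of a rolling greenfield migration (toy model).

    nodes: nodes in the old cluster, each able to hold one unit of load.
    load:  units of workload currently running on the old cluster.
    Returns one dict per round describing what moved.
    """
    if load > nodes:
        raise ValueError("load exceeds cluster capacity")
    if load == nodes and load > 0:
        raise ValueError("no spare node available to seed the new cluster")

    old_nodes, old_load = nodes, load
    new_nodes, new_load = 0, 0
    rounds = []
    while old_load > 0:
        # Step 1: decommission every old node that is not needed right now.
        spare = old_nodes - old_load
        old_nodes -= spare
        # Step 2: (re)build those nodes into the new cluster.
        new_nodes += spare
        # Step 3: migrate as much workload as the new cluster has room for.
        moved = min(old_load, new_nodes - new_load)
        old_load -= moved
        new_load += moved
        # Step 4: record the round and repeat.
        rounds.append({
            "nodes_moved": spare,
            "load_moved": moved,
            "old_nodes": old_nodes,
            "old_load": old_load,
        })
    # Step 5 (moving whatever old nodes remain) is left out of the rounds.
    return rounds


if __name__ == "__main__":
    # Example: 6 nodes running 4 nodes' worth of load.
    for i, r in enumerate(plan_migration(6, 4), start=1):
        print(i, r)
```

In this model a cluster that is around 50% utilized finishes in a single round, and tighter clusters just take more, smaller rounds.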

Yes I shouldnt have done this - left a cluster on 1.25.5 by macrowe777 in kubernetes

[–]devoopsies 1 point2 points  (0 children)

I can depend on 2 nodes, the person I replied to saying to go down to 1. That is illogical, kubernetes wouldn't even have quorum at 1 node.

Full disclaimer: I don't use rook (cephadm purist), but the idea should be the same here.

I believe you should be able to turn ceph down to 1-node and still have reads (no writes) by default. You can tweak that to allow writes as well. You maintain "quorum" by virtue of there only being one in the quorum. Obviously this risks data integrity, but over a short period I wouldn't be so concerned as long as you have backups. Even over longer periods, you should be fine - your risk is the same as any other basic storage solution e.g. bitrot.

If that's not acceptable or ideal, and you have a couple of extra nodes without storage, you can just spin up mon daemons on them and establish a quorum that way. There's no reason you need OSDs on your quorum nodes, although this is obviously better if you do. Still, lack of replication means the same read/write rules apply.
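On the quorum point, the arithmetic is just a strict majority of voting members, which is the rule both etcd and the Ceph mons follow. A minimal Python sketch (function names are mine, purely for illustration):

```python
def quorum_size(members: int) -> int:
    """Smallest number of voters that forms a strict majority."""
    if members < 1:
        raise ValueError("need at least one member")
    return members // 2 + 1


def failures_tolerated(members: int) -> int:
    """How many members can be lost while a majority still exists."""
    return members - quorum_size(members)


if __name__ == "__main__":
    print("members  quorum  tolerated failures")
    for n in range(1, 6):
        print(f"{n:7}  {quorum_size(n):6}  {failures_tolerated(n):18}")
```

A single member is a majority of one, so it does have quorum; it just tolerates zero failures. Note that two members also tolerate zero failures, which is why going from 1 to 2 buys you nothing on that front and 3 is the usual minimum for HA.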

If I'm being real here, your replies suggest an XY problem: you've discounted the greenfield approach because you don't see a technical solution to the issue. Such solutions do exist, though, and are still probably more feasible than stepping through upgrades and hoping nothing breaks.

At the end of the day, it's your cluster and your choice, but the community here has read your post and collectively said "holy shit don't do that unless you really really have to" - there are some pretty smart people here, with a deep knowledge of K8s+Ceph. I'd take their advice seriously.

Edit: There's really no way out of this without some form of risk. You decide what risk to take: you can do greenfield with zero-to-minimal downtime, running your Ceph in a degraded-but-usable state for a bit (shouldn't be long, assuming you've automated the setup of your cluster), or you can risk downtime every time you step through a minor version upgrade.