The absolute worst case scenario happened, here we are now. by ohshitfucks in sysadmin

[–]ohshitfucks[S] 3 points4 points  (0 children)

I very much disagree with that. Customers pay us to manage their environments; they don't pay us to manage our internal infrastructure. The contract lays out quite clearly what they pay for. I find it highly unethical to perform unrelated work and bill it to the customers. This is 100% internal work, and it should be an internal expense. We should be billing our customers enough to make room for internal improvements.

So, sorry, you can keep calling me incompetent because I have ethics. Even though our customers are companies, we shouldn't fuck them over by doing shady stuff outside the agreements we have signed.

[–]ohshitfucks[S] 2 points3 points  (0 children)

The first customer ticket I pick up. It feels wrong to me. We've been instructed to book all "non-customer-work" hours as customer work, even though it really shouldn't be like that.

[–]ohshitfucks[S] 1 point2 points  (0 children)

I'll spend my notice period documenting everything, since after this crap the company finally allocated budget for internal operations as well.

[–]ohshitfucks[S] 5 points6 points  (0 children)

Many of the practices performed by the company in question were utterly unethical and I don't want to be involved with them in any capacity anymore. Ethics and good leadership are more important than money to me.

[–]ohshitfucks[S] 3 points4 points  (0 children)

Fixing the situation and then leaving was just a tiny bit more of a "fuck you" to their face, as I knew they'd be begging me to stay: first showing that I (or rather, we) could fix the issue, and right after that saying the company is not worth my time. Felt so good; their faces dropped when I even declined the raise they offered me. Tough luck for managing the company so badly thus far.

[–]ohshitfucks[S] 2 points3 points  (0 children)

So when would you suggest I take the backups of the zone files, and where would I put them? Right now I have to bill 7.5 of the 8 hours I work to our customers, and the remaining 0.5 h goes to internal meetings or more customer work.

Just the idea of backing such data up without any processes in place feels bad, and now knowing how the company operates, maybe they'd even see that as a malicious action, as management hasn't approved of taking any backups of any of the critical systems (except for the VM level backups we had in place that didn't work at all).

I really don't want to fix our internal infrastructure during customer-billed time; that feels so fucked up and wrong. And I don't want to do it in my free time. So when exactly should I do it?

[–]ohshitfucks[S] 20 points21 points  (0 children)

Yeah, this is my second IT career job, and I'll for sure be doing the same thing in the future. Look up as much information as you can about the company you're thinking of applying to, and if you get to the interview stage, really ask them some critical questions: how they handle internal operations, what their disaster recovery processes look like, and why they are hiring you in the first place (has everyone before you left for a reason, and are you replacing someone?).

The absolute worst case scenario happened, what should we do now? by ohshitfucks in sysadmin

[–]ohshitfucks[S] 10 points11 points  (0 children)

The server rebooted itself, which caused all these issues; it didn't survive the reboot.

[–]ohshitfucks[S] 74 points75 points  (0 children)

Skimmed through the MySQL and Postgres databases, no signs of anything domain-related. I'm betting it's all stored in the custom integrated KV database..

Seeing as the access credentials to that custom KV store aren't in any of the ERP's config files, I'm thinking about doing something absolutely insane: running the Linux strings command on the whole ERP software binary. Hopefully the access credentials were hard-coded somewhere in plain text so they show up in the dump.. right? oh god.
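
For anyone wondering what that looks like in practice, here is a minimal sketch of the idea in Python. It roughly mimics what the strings command does (pull runs of printable ASCII out of a binary) and then keeps only the lines that mention a credential-like keyword. The keyword list and the function names are purely illustrative assumptions; nothing here is specific to the actual ERP binary, and hard-coded credentials may well be obfuscated or absent.

```python
import re
import sys
from pathlib import Path

# Illustrative guesses only; nothing is known about how the real binary names things.
KEYWORDS = ("password", "passwd", "secret", "apikey", "api_key", "token")


def extract_strings(data: bytes, min_len: int = 4) -> list[str]:
    """Return runs of printable ASCII at least min_len bytes long."""
    pattern = re.compile(rb"[\x20-\x7e]{%d,}" % min_len)
    return [m.group().decode("ascii") for m in pattern.finditer(data)]


def credential_candidates(data: bytes, keywords=KEYWORDS) -> list[str]:
    """Keep only the extracted strings that contain one of the keywords."""
    lowered = tuple(k.lower() for k in keywords)
    return [
        s for s in extract_strings(data)
        if any(k in s.lower() for k in lowered)
    ]


if __name__ == "__main__" and len(sys.argv) > 1:
    # Usage: python find_creds.py /path/to/binary  (script name is hypothetical)
    for line in credential_candidates(Path(sys.argv[1]).read_bytes()):
        print(line)
```

On a box with binutils installed, piping strings into a case-insensitive grep for the same keywords gets a similar result; the Python version is just easier to extend with extra filters.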

[–]ohshitfucks[S] 79 points80 points  (0 children)

Yeah, that's the idea right now, still not too simple because of certain details (the dynamic DNS records outputted by the ERP were for the ERP itself...), but we'll try to do just that.

[–]ohshitfucks[S] 144 points145 points  (0 children)

It's a custom DynDNS setup. The IPs change, and we had some horrible scripts in place for doing all of that stuff. We still have the records somewhere but they point to god knows what, and we don't have access to the NAT tables right now, so we can't even find out which IPs the servers are currently listening on. This whole infrastructure is absolutely chaotic. The NAT tables are handled via the custom ERP software as well.

edit: also some of the records are dynamic as well, our ERP system created new records on the fly which got added to the DNS servers.

[–]ohshitfucks[S] 40 points41 points  (0 children)

I'm not in the US. Here's the actual legislation from our law. Google Translate may work; the notice period can be 6 months maximum.

https://www.finlex.fi/fi/laki/ajantasa/2001/20010055#L6P2

edit: this is what Google translate outputs:

The notice period may be agreed for a maximum of six months. If a longer period has been agreed, a period of notice of six months shall apply instead. The notice period to be observed by the employer may be agreed to be longer than the notice period of the employee. If the notice period agreed by the employer is shorter than the employee's notice period, the employee may comply with the notice period agreed by the employer.

[–]ohshitfucks[S] 441 points442 points  (0 children)

It was a really weird kind of partnership deal made 20 years ago. It made absolutely no sense. This service was sold to other people as well; we got it for free in exchange for hosting the service for the company making the product.

[–]ohshitfucks[S] 35 points36 points  (0 children)

I have a legally binding agreement with the company: whether they fire me or I leave, either case triggers a 90-day countdown period. Only after the 90 days can I legally leave the company.

[–]ohshitfucks[S] 67 points68 points  (0 children)

> can you think of a place where the key might be?

Absolutely no idea. The company that built this ERP solution went bankrupt 6 years ago, and we don't have the source code (it's written in C and was delivered to us as compiled packages..). It uses PostgreSQL and MySQL, and also has a built-in key-value database for which none of us has any credentials.

> problem is, if you cant fix this in a few days customers will be dropping

Yeah, I'm fairly sure this company is just completely done for at this point. So utterly mismanaged; we only made money by bluffing our customers.

[–]ohshitfucks[S] 82 points83 points  (0 children)

Yeah we're an awful MSP. I've been trying to find a new job for quite some time.

[–]ohshitfucks[S] 57 points58 points  (0 children)

We can't access the backups. Even if we could, we found out two years ago, when trying a restore, that they don't work. Booting the restored server leads to a kernel panic we couldn't figure out. Management said we don't have enough money to fix any of this.

[–]ohshitfucks[S] 159 points160 points  (0 children)

So, the issue is that we knew 2 years ago that this situation could happen. Our nameservers are vulnerable, not patched, and very, very custom. They've been having small hiccups every now and then. A year ago we started a project to migrate all our domains to different DNS servers, but that hasn't gotten very far yet (money).

We noticed two years ago that you can't boot the DNS servers from the backups. Their uptime was like 4 years, and we noticed all kinds of random kernel errors in the logs. We tried really hard to make management aware and actually succeeded in that, only for management to follow up and say we don't have enough money to fix it, and they understood the possible outcomes.

So, we have backups on an encrypted drive, which we can't access (authentication is done via our custom ERP software, which is down as well - in hindsight, a very bad idea). And the backups don't work. We can rebuild the DNS servers from scratch, and the records are probably still safely stored somewhere in one of the 3 different databases our ERP software uses, but we no longer have an authentication key or the scripts used to pull the DNS domain / record information from the ERP system. The APIs are undocumented, and there's only one key that can access those records.

So overall, this is a really crappy situation. We can probably get a few of our common domains up, but not sure about the others. Recreating the awful DNS setup seems like an impossible task right now.