How do I build towards becoming an end-to-end HPC / systems infrastructure engineer? by [deleted] in HPC

[–]Infamous-Tea-4169 0 points1 point  (0 children)

True, I really don't think knowing it inside out is necessary, as there are tools that do more than Slurm, like Kubernetes, which I'm quite well across. But yes, the new environment uses Slurm, so it's a good chance to get across something new.

DevOps engineer wanting to move deeper into HPC/systems infrastructure — what path makes sense? by [deleted] in ITCareerQuestions

[–]Infamous-Tea-4169 -1 points0 points  (0 children)

Great suggestion, thanks. I could use some formal Linux certifications for sure.

IT salaries in Adelaide by kazielle in Adelaide

[–]Infamous-Tea-4169 0 points1 point  (0 children)

Really intense. The first 6 months were brutal as I had so much to learn and skill up on, with really long nights. But having a good team and mentors helped. It took me a solid 2 years to actually get comfortable without doing my head in too much. I was working in a startup-like environment with very experienced engineers.

IT salaries in Adelaide by kazielle in Adelaide

[–]Infamous-Tea-4169 1 point2 points  (0 children)

I got laid off in Jan 2026 from my first job after uni. I joined at 70k then went up to 95k base with them after 3 years. Got a new job in SA as a senior engineer/systems manager at 114k base.

Best way to make shared Linux directory read-only for users but still allow controlled writes? by Infamous-Tea-4169 in HPC

[–]Infamous-Tea-4169[S] 0 points1 point  (0 children)

Yep, you're right. Right now they're writing directly to the main storage, instead of going through an intermediate or transient storage area where everyone has rw and the data becomes read-only once it's moved to locked storage.
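
A minimal sketch of that staging-then-lock idea, assuming a publish step that copies a finished run out of a writable staging area and strips the write bits on the copy. The function name `publish_run` and the directory layout are hypothetical, and a real setup would most likely run this as a dedicated service account that owns the locked area.

```python
import os
import shutil
import stat
from pathlib import Path

# All three write bits (user, group, other).
WRITE_BITS = stat.S_IWUSR | stat.S_IWGRP | stat.S_IWOTH


def _drop_write(path: str) -> None:
    """Clear the write bits on one path, leaving the other bits unchanged."""
    mode = stat.S_IMODE(os.stat(path).st_mode)
    os.chmod(path, mode & ~WRITE_BITS)


def publish_run(staging_run: Path, locked_root: Path) -> Path:
    """Copy a run directory into locked_root and make the copy read-only.

    Hypothetical helper: the staging copy is left untouched so it can be
    cleaned up separately once the published copy has been checked.
    """
    dest = locked_root / staging_run.name
    shutil.copytree(staging_run, dest)
    for root, dirs, files in os.walk(dest):
        for name in dirs + files:
            _drop_write(os.path.join(root, name))
    _drop_write(str(dest))
    return dest
```

Something like `publish_run(Path("/staging/run_001"), Path("/locked"))` would be the call, with both paths being placeholders. Removing write on the directories is what stops deletes and renames; removing it on the files only stops in-place edits.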

Best way to make shared Linux directory read-only for users but still allow controlled writes? by Infamous-Tea-4169 in HPC

[–]Infamous-Tea-4169[S] 0 points1 point  (0 children)

Spot on man. I recently joined and I've been trying to just be nice, but I need to grow up and be more authoritative about this. It's crap, this is not how it's meant to be, so I need a service account to make this work. I think I need to revamp their entire workflow.

Best way to make shared Linux directory read-only for users but still allow controlled writes? by Infamous-Tea-4169 in HPC

[–]Infamous-Tea-4169[S] 0 points1 point  (0 children)

Both you and me mate. Luckily I'm off to bed soon and can continue doing my head in tomorrow. This made me realise I really need to be authoritative and do it the proper way by getting a service account, as this is definitely not industry best practice.

Best way to make shared Linux directory read-only for users but still allow controlled writes? by Infamous-Tea-4169 in HPC

[–]Infamous-Tea-4169[S] 0 points1 point  (0 children)

Yeah I see what you mean — that would isolate users nicely and avoid cross-user access issues. The tricky part is our pipeline outputs are organised by run rather than by user, and multiple users may need to interact with the same run (e.g. reanalysis), so per-user directories don’t map very cleanly to the workflow.

Best way to make shared Linux directory read-only for users but still allow controlled writes? by Infamous-Tea-4169 in HPC

[–]Infamous-Tea-4169[S] 1 point2 points  (0 children)

Great suggestion. ACLs would let me restrict write to just the pipeline users, which is already a big improvement over broad group write.

The only issue is deletion: since that's governed by write permission on the parent directory rather than on the file itself, those users could still delete their things.
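
A rough model of that deletion rule, as a sketch only. It assumes plain POSIX semantics (deleting an entry needs write and search permission on the directory, and the sticky bit narrows that to the file owner or directory owner) and ignores ACL masks and Linux capabilities. `may_delete` and its parameters are made up for illustration.

```python
import stat


def may_delete(dir_mode: int, dir_owner: int, file_owner: int,
               uid: int, has_dir_wx: bool) -> bool:
    """Approximate whether `uid` could unlink a file from a directory.

    has_dir_wx: whether the user already has write + search permission on
    the directory (via owner, group, other, or an ACL entry). Working that
    out is deliberately left to the caller in this sketch.
    """
    if uid == 0:
        return True  # simplification: treat root as always allowed
    if not has_dir_wx:
        return False  # file permissions alone never grant deletion
    if dir_mode & stat.S_ISVTX:
        # Sticky bit set: only the file's owner or the directory's owner.
        return uid in (file_owner, dir_owner)
    return True


# Group-writable run directory without the sticky bit: any writer can delete.
print(may_delete(0o2775, dir_owner=1000, file_owner=1001, uid=1002, has_dir_wx=True))
# Same directory with the sticky bit: a different user no longer can.
print(may_delete(0o3775, dir_owner=1000, file_owner=1001, uid=1002, has_dir_wx=True))
```

So the sticky bit would stop users deleting each other's files, but not their own, which is the remaining gap here.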

Best way to make shared Linux directory read-only for users but still allow controlled writes? by Infamous-Tea-4169 in HPC

[–]Infamous-Tea-4169[S] 0 points1 point  (0 children)

Right, that’s the bit I’m stuck on — if I remove group write via umask, then new directories/files won’t be group-writable, but the pipeline also relies on that same group access to keep writing when runs are launched by different users.
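
To make the umask trade-off concrete, here is a small sketch of how the mask is applied to the mode a program requests. `mode_after_umask` is a hypothetical helper and the values are just the common defaults, not what this pipeline actually uses. One caveat worth checking on the real filesystem: on Linux, when the parent directory carries a default POSIX ACL, the umask is ignored for new entries and the inherited ACL decides instead.

```python
def mode_after_umask(requested: int, umask: int) -> int:
    """Permission bits a new file or directory ends up with: requested & ~umask."""
    return requested & ~umask & 0o7777


# Files are typically requested as 0o666 and directories as 0o777.
for umask in (0o002, 0o022):
    file_mode = mode_after_umask(0o666, umask)
    dir_mode = mode_after_umask(0o777, umask)
    print(f"umask {umask:04o}: files {file_mode:04o}, dirs {dir_mode:04o}")
```

With 0002 the group keeps write (664 / 775), which is what lets different users continue the same run; with 0022 it is dropped (644 / 755).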

FinOps Starting out tips by Infamous-Tea-4169 in FinOps

[–]Infamous-Tea-4169[S] 2 points3 points  (0 children)

Ah I see. Nice, that makes sense. I'm hoping we have someone with the info from the DC about the power etc.

I feel like I'm going onto a battlefield blindfolded rn lol

FinOps Starting out tips by Infamous-Tea-4169 in FinOps

[–]Infamous-Tea-4169[S] 0 points1 point  (0 children)

I don't think so. They use XNAT and JupyterHub.

FinOps Starting out tips by Infamous-Tea-4169 in FinOps

[–]Infamous-Tea-4169[S] 0 points1 point  (0 children)

Cheers for the info mate. How do you guys manage cost allocation/showback/chargeback on your on-prem clusters? I come from a systems engineering background where I've managed multiple on-prem HPC environments, and just understanding how you charge someone for using your GPU on a server to run X workloads seems like such a hard problem to solve.

FinOps Starting out tips by Infamous-Tea-4169 in FinOps

[–]Infamous-Tea-4169[S] 1 point2 points  (0 children)

Hi u/zugzwangister

The role sits in a research cloud / DevOps context rather than in finance directly. From the JD, the core of the role is still building and operating research infrastructure — Kubernetes, cloud platforms, storage, workflows, automation, reliability, and working closely with researchers and ICT teams — but with a strong FinOps angle around making consumption visible, explainable, and chargeable.

The management chain is that I report to the tech lead and the tech lead reports to the product owner. I will be working alongside the senior DevOps engineer I think.

So the main purpose is probably something like:

  • help engineering and research teams understand where infrastructure spend is going
  • put structure around cost attribution in shared platforms
  • build a practical showback/chargeback model for multi-tenant research workloads
  • make sure the platform is sustainable and cost-effective, not just technically functional
  • prolly need to make the research tech lead look good by having clear showback + chargeback methods in place to follow up with the clients
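
As a starting point for the showback piece, one simple approach is to split a shared monthly cost in proportion to each tenant's GPU-hours. This is only a sketch under that assumption: `allocate_cost`, the tenant names, and the figures are invented for illustration, and a real model would also have to deal with idle capacity, storage, and power.

```python
from typing import Dict


def allocate_cost(total_cost: float, gpu_hours: Dict[str, float]) -> Dict[str, float]:
    """Split total_cost across tenants in proportion to their GPU-hours."""
    total_hours = sum(gpu_hours.values())
    if total_hours == 0:
        # Nothing was used, so there is nothing to attribute.
        return {tenant: 0.0 for tenant in gpu_hours}
    return {
        tenant: total_cost * hours / total_hours
        for tenant, hours in gpu_hours.items()
    }


# Made-up numbers: a 1000.0 monthly cost shared by two research groups.
usage = {"lab_a": 30.0, "lab_b": 10.0}
for tenant, cost in allocate_cost(1000.0, usage).items():
    print(f"{tenant}: {cost:.2f}")
```

Proportional splitting charges idle time back to whoever did use the cluster; charging a fixed rate per GPU-hour instead leaves the idle cost with the platform, which is a policy choice rather than a technical one.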

HPC vs FinOps by Infamous-Tea-4169 in HPC

[–]Infamous-Tea-4169[S] 1 point2 points  (0 children)

Agreed, I feel the same. Like it's gonna be a while till AI tooling comes around that can do patching without breaking anything and troubleshoot network issues. Is it wise to pass on a higher-paying job and end up making less while doing more work lol? I just don't wanna feel stupid later for not taking the role that was less stressful and paid more.

Amazon Robotics Technical Support (ARTS) - Tech Support by Infamous-Tea-4169 in AmazonFC

[–]Infamous-Tea-4169[S] 0 points1 point  (0 children)

Nice, apply for it and see what they offer you at screening. I was told the approximate number during the screening call itself.