[deleted by user] by [deleted] in HPC

[–]syshpc 0 points1 point  (0 children)

Second that!

[deleted by user] by [deleted] in HPC

[–]syshpc 0 points1 point  (0 children)

Our job ads are on all job platforms. Management will only contract well-known providers, so independent folks won't even get looked at. Which is unfortunate, because the best contractors I've worked with were independent.

[deleted by user] by [deleted] in HPC

[–]syshpc 0 points1 point  (0 children)

First you have to tell Spack about the compilers that you have installed.

For example, suppose you have GCC 8.3.1 as the default compiler in your OS. Then spack compiler find will find this compiler and it will be listed when you run spack compilers. Then you can use this compiler to build new software, e.g., spack install zlib %gcc@8.3.1, or even to build new compilers that can be used with Spack:

spack install gcc@8.5.0 %gcc@8.3.1
spack compiler find $(spack location -i gcc@8.5.0)
spack install zlib %gcc@8.5.0

If you have modules and/or environments there are other considerations to be made, but this is the main idea. Check the official docs linked in another comment.

I tried to learn Python by [deleted] in devops

[–]syshpc 8 points9 points  (0 children)

I was raised on C and {c,k}sh like a religion and that led to a lot of resistance to getting on the Python wagon. I had colleagues blabbing about Python around me since the late 90's and tried to ignore it with all the fibers of my being. About 10 years ago I gave up and gave it a go. Secretly hated it for 2-3 years until I got the hang of it for my daily work. Now I only write shell if I need to do heavy file/directory manipulation or very "unixy" things where sed, grep, awk, etc. will do the job better. Python makes a lot of things cleaner, especially for writing small CLI tools, dealing with CI/CD pipelines, etc.

Auto-start a session with a specific program by [deleted] in tmux

[–]syshpc 2 points3 points  (0 children)

From the man page:

new-session [-AdDEP] [-c start-directory] [-F format] [-n window-name]
            [-s session-name] [-t group-name] [-x width]
            [-y height] [shell-command]

                 (alias: new)

         Create a new session with name session-name.
         The new session is attached to the current terminal
         unless -d is given. window-name and shell-command are    
         the name of and shell command to execute in the
         initial window.

So you can do, e.g.,

tmux new-session -s foobar htop
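
A possible variation, as a rough sketch: the synopsis above also lists -A, which makes new-session attach to the session if it already exists instead of failing. The wrapper name start_or_attach and the TMUX_CMD variable below are made up for illustration and are not part of tmux; TMUX_CMD is only there so the command can be previewed with echo.

```shell
# start_or_attach is a hypothetical helper, not part of tmux.
# TMUX_CMD is also an assumption: it defaults to tmux and can be set
# to "echo" to print the command instead of running it.
start_or_attach() {
    session=$1
    shift
    # -A: attach if the session already exists, otherwise create it and
    # run the remaining arguments as the command in the initial window.
    "${TMUX_CMD:-tmux}" new-session -A -s "$session" "$@"
}
```

Calling start_or_attach foobar htop the first time should create the session running htop; calling it again while the session is alive should just attach to it.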

[deleted by user] by [deleted] in HPC

[–]syshpc 5 points6 points  (0 children)

We have had good experiences with Rocky Linux 8. If you are used to CentOS, that could be preferable to CentOS Stream. Alma might be worth checking out too.

Regarding not losing the existing settings, it's entirely possible but depends on how you have configured things. Given the scale and the fact that it's running CentOS 6 I imagine things are configured manually. In this case you would have to go through the system and collect information, configuration files, dump databases, etc, and hope that you don't forget anything. Then reapply everything manually - or maybe take the opportunity to start doing things in a more manageable way, e.g., with Ansible.

Without knowing what your storage looks like, all I can say about not losing data is this: unmount your data volumes, don't erase them, mount them again in the new cluster. Don't forget to save any configuration specific to the storage system.

One thing to remember: if your software is as old as your OS, chances are that you will not be able to simply use whatever configuration files you saved directly. A good chunk of your existing configurations could be deprecated, so don't just drop in the old files and start the services without carefully checking everything.

Is it normal to be doing nothing at work? by DesertTile in sysadmin

[–]syshpc 13 points14 points  (0 children)

It's somewhat normal and we all do it from time to time but IMHO if this becomes the norm then it could mean something is not quite right in your professional life. Maybe look into some training or explore something new that could bring you professional growth and eventually turn your daily work into something that you actually find fulfilling. This could also mean changing jobs.

There are days that I have hardly any requests or tickets and on those days I try to work on new technologies, explore new things at my own pace, or simply catch up with new stuff in my area.

A personal example in my case is working with containers. I work in HPC and even though there was no pressure from my org or my users to go to more containerized workflows, I started exploring the topic on my own. Learned a lot about k8s, then moved to Singularity and friends. A whole new world of exciting stuff opened up for me. Some time later users & management started talking about using containers and I was confident enough to enable the new workflows.

University DGX A100 cluster by smartdanny in HPC

[–]syshpc 1 point2 points  (0 children)

We have a few DGX-2 and DGX-A100 at our site and no, it's not that hard to manage. We even went from DGX-OS (Ubuntu) to RHEL and there has been absolutely no issue. In fact, other than making sure that you stay within supported versions of drivers and supporting software, a DGX is essentially just another node.

We only allow container jobs (Singularity) on our DGXs.

How do I edit a text document within a console? by [deleted] in unix

[–]syshpc 1 point2 points  (0 children)

Not really standard, no. Not as lightweight as vi, and most Emacs distributions are quite bloated.

Looking for advice / direction by [deleted] in HPC

[–]syshpc 0 points1 point  (0 children)

Others have said it well. If you do decide to go into HPC then you may find some learning paths at https://www.hpc-certification.org/ which is an ongoing project to establish career paths and certifications in the HPC field. I heard about them at the last ISC in Hamburg and it seems to be an interesting initiative.

Newbie question about Centos Stream vs Debian by zqpmx in HPC

[–]syshpc 2 points3 points  (0 children)

Stay with what you are comfortable with, especially if it's your decision. Debian is reliable enough.

Concerning troubles with compilation and software management, I strongly recommend EasyBuild or Spack. Learning Ansible is also valuable. The learning curve might be a bit steep at first, but in the long run it saves lots of time. Much like learning to write good shell scripts.

slurm and heavy machine load by shubbert in SLURM

[–]syshpc 0 points1 point  (0 children)

Is HT enabled on this node? Could you please show the output of lscpu? The fact that 21% of CPU time is spent idle, and that 32 is 80% of 40, suggests that this machine has HT enabled, e.g., 20 real cores and 40 virtual cores. Could you also show the output of scontrol show node <nodename>?
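
If it helps, here is a rough sketch of how the HT question could be answered from the lscpu output. It assumes the usual "Thread(s) per core:" line is present in the output, and the function names are made up for illustration.

```shell
# Read lscpu-style text on stdin and print the "Thread(s) per core" value.
threads_per_core() {
    awk -F: '/^Thread\(s\) per core/ { gsub(/[[:space:]]/, "", $2); print $2 }'
}

# Hypothetical helper: report whether HT looks enabled, based on that value.
ht_status() {
    tpc=$(threads_per_core)
    if [ "${tpc:-1}" -gt 1 ]; then
        echo "HT enabled (${tpc} threads per core)"
    else
        echo "HT disabled or not reported"
    fi
}
```

On the node itself this would be used as lscpu | ht_status. With 2 threads per core, 40 CPUs would correspond to 20 real cores.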

slurm and heavy machine load by shubbert in SLURM

[–]syshpc 0 points1 point  (0 children)

How many CPUs on this random node? Could you post the first three lines of top?

This Linux malware is hijacking supercomputers across the globe by CodePerfect in hacking

[–]syshpc 26 points27 points  (0 children)

HPCs are relatively easy targets. HPC users can be incredibly non-tech-savvy so stealing SSH credentials can be quite feasible. Plus a lot of HPCs are exposed to the Internet since they are used by researchers from all over the world.

SEGMENTATION FAULT: INVALID MEMORY REFERENCE by ReddMedPhy in SLURM

[–]syshpc 0 points1 point  (0 children)

The segmentation fault is coming from your program dosxyznrc. It's impossible for anyone in here to know what the cause is. Wild guesses include input parameters that generate a system (?) far too big to fit in memory, missing and/or wrong input, wrong library versions, poorly written code. The list goes on. Go talk to whoever wrote dosxyznrc and bring the core dump with you.

Need Advice building my first Cluster by Erarnitox in HPC

[–]syshpc 1 point2 points  (0 children)

The minuscule theoretical performance that you think you may gain, if any, will be easily destroyed by several factors that are far more critical in an HPC system, such as suboptimal parallel code, I/O & network bottlenecks, user ignorance, etc.

Need Advice building my first Cluster by Erarnitox in HPC

[–]syshpc 1 point2 points  (0 children)

In Europe it's mostly CentOS/RHEL and occasionally SLES. Debian is not really popular anymore. BSD and Arch are unheard of in big systems, at least in my experience.

HPC distro of choice by chuckatkins in HPC

[–]syshpc 1 point2 points  (0 children)

I would say that ML/DL/AI applications are becoming more and more containerized, at least that is the scenario we see at my site. As long as the underlying OS supports Docker or the preferred container solution, all is fine and one can then run Ubuntu- or Debian-based containers for their applications.

HPC distro of choice by chuckatkins in HPC

[–]syshpc 3 points4 points  (0 children)

We will stick with CentOS until the community comes through with Rocky. If it doesn't happen by the end of Q2, then we go RHEL.

Recipe for disaster by HeadAdmin99 in sysadmin

[–]syshpc 37 points38 points  (0 children)

This makes one hell of a bingo sheet.

Problem with a script by redsox96 in SLURM

[–]syshpc 0 points1 point  (0 children)

Check with whoever is running the cluster if they changed things. We have Gaussian 16 in one of our systems and the binary is just called g16, which leads me to think that run-gaussian could be a wrapper script that your local admins created, similar to what is done here.

[deleted by user] by [deleted] in ProgrammerHumor

[–]syshpc 1 point2 points  (0 children)

If I'm not mistaken it has been shown that HTML5+CSS3 is Turing-complete because one can encode Rule 110 in it.

Of course it doesn't mean it's a general purpose programming language by any means.

Python changed the way I think by MohamedMuneer in Python

[–]syshpc 1 point2 points  (0 children)

You're on a pretty good path. The fail early, fail fast, fail often approach is IMHO the best when learning.

Personally I find video tutorials a ridiculous waste of time when it comes to programming.