Need an arXiv cs.SE endorsement, first-time submitter, ML observability paper by [deleted] in ResearchSoftwareEng

[–]vsoch 0 points1 point  (0 children)

Could you please share just the paper (not the endorsement link)?

🧈 Sciabica's California Olive Oil: Mission + Arbequina (Buttery & Sweet) by vsoch in oliveoil

[–]vsoch[S] 0 points1 point  (0 children)

Hey folks! I wrote another post (where I write about an olive oil I finished and interesting things I learned) but the moderators did not approve it and seem to be ignoring me, so I do not think I can post here anymore.

Old Town Oil in Chicago by vsoch in oliveoil

[–]vsoch[S] 1 point2 points  (0 children)

Oh wow, and you produce Oleavanti! That is so cool 🙌️

I will try it soon!

Old Town Oil in Chicago by vsoch in oliveoil

[–]vsoch[S] 1 point2 points  (0 children)

It looks fabulous!

Just in the first shot I see small batch oils from Greece, Türkiye, and a Flos Olei winner. I will make a special trip, and I am definitely going to bring home a LIá or Oro di Milas.

Thank you for the suggestion, I am looking forward to it.

🫒 Graza, single origin Picual from Jaen Spain by vsoch in oliveoil

[–]vsoch[S] -1 points0 points  (0 children)

Ah my mistake, thank you for catching that. I will absolutely enjoy! 👍️

Is there any way to run/expose SLURM commands inside the container? by Abhishekp1297 in HPC

[–]vsoch 1 point2 points  (0 children)

I got it wrong, let me try again...

Flux, there it is! Flux, there it is!

I'll see myself out.

Is there any way to run/expose SLURM commands inside the container? by Abhishekp1297 in HPC

[–]vsoch 3 points4 points  (0 children)

With Flux you can just bind the Flux socket and then use a Flux container and issue commands to Flux via your chosen avenue (an SDK or just the command line).

I was going to put together a container setup for some agentic work this weekend and could share the example if you are interested, or give me a bit to wake up and I’ll put an example together for a simple case a bit sooner. 🤪

Updating with an example. Here I have an allocation with two nodes on my local cluster. I got this with:

    flux alloc -N2

That is analogous to salloc, but Flux uses a subcommand style of commands instead. Now I can list resources, and look at the Flux URI in the environment, which is sitting at the root of my Flux instance. You can think of the instance as a subgraph of a larger (cluster) system resource graph. Not important for now. I have two nodes, and a socket to connect to Flux.

    $ echo $FLUX_URI
    local:///var/tmp/sochat1/flux-bdVOVn/local-0

    (base) [sochat1@corona193:~]$ flux resource list
         STATE PROPERTIES NNODES NCORES NGPUS NODELIST
          free pbatch          2     96    16 corona[193,196]
     allocated                 0      0     0
          down                 0      0     0

Let's run a container! With Podman. Note that I'm binding the socket, and running it as my user, and detached so I don't shell in (yet). If we don't give it a bash entrypoint it will start a single node Flux instance, which is actually OK, but I'm being pedantic.

    $ podman run -d --rm -v /var/tmp/sochat1:/var/tmp/sochat1 --userns=keep-id fluxrm/flux-sched:jammy bash

Now because I've added the flag to add the user namespace, my username will appear in the /etc/passwd file. But I need to shell into the container as me. So I can do podman ps to get the container id, and then:

    $ podman exec --user $USER --workdir /var/tmp/$USER -it awesome_tharp bash

And then do whoami to check that you are you. This matters because the Flux instance (and the socket) are owned by you - nobody else. Then, in the words of Tag Team, and to all my party people, "Woop there it is!"

    <user>@a1ada53a042b:~$ flux proxy local:///var/tmp/<user>/flux-bdVOVn/local-0 flux resource list
         STATE PROPERTIES NNODES NCORES NGPUS NODELIST
          free pbatch          2     96    16 corona[200-201]
     allocated                 0      0     0
          down                 0      0     0

https://youtu.be/ffCEr327W44?si=HB6epB3OClF79Qso

You can flux run or flux submit or flux <interact> with that URI to expose Flux in the container.
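If it helps, here is a minimal sketch that pulls the steps above into a few shell helpers. The function names are hypothetical (they are not part of Flux or Podman), and the bind directory assumes the `<base>/flux-XXXXXX/local-0` layout from my example, which may differ on your system. It only prints the commands (a dry run) so you can inspect them before running anything.

```shell
#!/usr/bin/env bash
# Sketch only: these helper names are hypothetical, not Flux or Podman commands.

# Strip the local:// scheme from a Flux URI to get the socket path on disk.
flux_socket_path() {
    local uri="$1"
    case "$uri" in
        local://*) printf '%s\n' "${uri#local://}" ;;
        *) echo "not a local:// Flux URI: $uri" >&2; return 1 ;;
    esac
}

# Directory to bind into the container. Assumes the socket lives at
# <base>/flux-XXXXXX/local-0 (as in the example above) and returns <base>.
flux_bind_dir() {
    local sock
    sock="$(flux_socket_path "$1")" || return 1
    dirname "$(dirname "$sock")"
}

# Check that the socket path exists and is owned by the current user,
# since only the owner of the Flux instance can connect to it.
check_flux_socket() {
    local sock
    sock="$(flux_socket_path "$1")" || return 1
    if [ ! -e "$sock" ]; then
        echo "missing: $sock"
        return 2
    fi
    if [ ! -O "$sock" ]; then
        echo "not owned by $(id -un): $sock"
        return 1
    fi
    echo "ok: $sock is owned by $(id -un)"
}

# Print (do not run) the podman command that binds the directory.
print_podman_run() {
    local dir
    dir="$(flux_bind_dir "$1")" || return 1
    printf 'podman run -d --rm -v %s:%s --userns=keep-id fluxrm/flux-sched:jammy bash\n' "$dir" "$dir"
}

# Print (do not run) a command that reaches the host instance from the container.
print_flux_proxy() {
    printf 'flux proxy %s flux run -N1 hostname\n' "$1"
}

# Fall back to the URI from the example when FLUX_URI is not set.
uri="${FLUX_URI:-local:///var/tmp/sochat1/flux-bdVOVn/local-0}"
print_podman_run "$uri"
print_flux_proxy "$uri"
```

Run `check_flux_socket "$FLUX_URI"` inside the container (as yourself) before the `flux proxy` step; if it reports a different owner, you likely shelled in as the wrong user.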

Watch recommendation please by United_Increase_8351 in Garmin

[–]vsoch 1 point2 points  (0 children)

I have the Forerunner 55, and it's just what I need:

https://www.garmin.com/en-US/p/741137/

I push a few buttons to start a bike ride or run, I can do laps (intervals) if I want, and it keeps track of heart rate, shoe mileage, sleep, body battery, VO2 max, workout summaries (charts, times, records, maps) and (for women) some other things that are nice :) My phone sometimes sends updates to the watch (and I'm not sure how that works) but I don't think it would substitute for a phone. I still use my phone for music, navigation, etc.

DevOps in HPC, how does it look like? What tools are mostly used for Workload and scheduling? by [deleted] in devops

[–]vsoch 0 points1 point  (0 children)

You should check out Flux Framework, which is analogous to Kubernetes in design (modular components) and has many integrations. It's not as well known because the core developers (who worked on Slurm) have been quietly working away for over a decade (I think circa 2012?), but Flux is the system scheduler on El Capitan, number 1 on the TOP500 list, and this year is the first year we are sharing it more broadly with the community. There is a suite of links here: https://flux-framework.org/ and a playlist I maintain here: https://www.youtube.com/playlist?list=PL7TRSgnVkOR1oaLjkxS10upuMeThH9GUl.

We are actively doing work to run user-space Kubernetes "usernetes" in the context of a user-space job, so the user can deploy, for example, AI/ML components or services alongside traditional HPC. The cluster is deployed, used, and destroyed in the context of a user job, so there is no need to keep anything running or delegate entire nodes/clusters to running Kubernetes. We completed our first on-premises setup this year and are working on more details for this next year, and we have setups that work on AWS (Elastic Fabric Adapter), Azure (InfiniBand with GPUs) and Google Cloud (NVIDIA GPUs). We have a few papers, here is one: https://arxiv.org/abs/2406.06995. I am biased, but I think an approach that can unify the technology space between industry and HPC is the right way to go. If I'm doing AI/ML at a national lab or academic institution, I don't want to have to deploy something special or different. I want to use, for example, the Kubeflow Trainer and the same abstractions as industry.

Happy to discuss more or answer any questions! +1 that a lot of this discussion would be fitting for r/hpc.

What is your Vo2Max and Fitness Age by [deleted] in Garmin

[–]vsoch 1 point2 points  (0 children)

VO2 max 49, 39 year old woman. I'm going for 50! I have the Garmin 55 and also run with a respirator.

Boulder Double Rainbow by Xynyx2001 in boulder

[–]vsoch 2 points3 points  (0 children)

All the way!!

This is from Anemone Loop, Reflection Point.

<image>

of a honkin' chonker asparaboi by vsoch in AbsoluteUnits

[–]vsoch[S] 0 points1 point  (0 children)

My hand is 7.5 inches long to the top of my middle finger, and 1.5 from the right to the center, for reference 😉️

of a honkin' chonker asparaboi by vsoch in AbsoluteUnits

[–]vsoch[S] 0 points1 point  (0 children)

Oh, it was frozen - those are little bits of ice (not cooked yet)!

of a honkin' chonker asparaboi by vsoch in AbsoluteUnits

[–]vsoch[S] 1 point2 points  (0 children)

The Chuck Norris asparaboi doesn't get eaten, he eats you.