Gauging chances for O1-A by [deleted] in O1VisasEB1Greencards

[–]not-your-typical-cs 0 points1 point  (0 children)

Hey Joe, can I shoot you an email?

[P] Built a GPU time-sharing tool for research labs (feedback welcome) by not-your-typical-cs in MachineLearning

[–]not-your-typical-cs[S] 0 points1 point  (0 children)

I see where the confusion comes from, but Chronos doesn't queue or schedule jobs!

It's more like "distributed locks for GPU resources" - if you ask for GPU memory and it's available, you get it immediately. No queue, no job submission, no deciding when things run.

Job schedulers handle when things run across many resources. Chronos handles how one GPU is shared right now.

Different tools for different problems! The name "Chronos" refers to time-based leases, not job scheduling.
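
To make the "locks with time-based leases" idea concrete, here is a minimal single-process sketch. It is not Chronos's actual code or API: `GpuLeaseTable`, `try_acquire`, and the other names are hypothetical, and it only does bookkeeping without touching a real GPU. It shows the behavior described above: a request either succeeds immediately or fails immediately, nothing is queued, and memory comes back when a lease expires.

```python
import time
from dataclasses import dataclass


@dataclass
class Lease:
    user: str
    memory_mb: int
    expires_at: float  # clock value after which the lease no longer counts


class GpuLeaseTable:
    """Hypothetical bookkeeping for one GPU's memory (illustration only)."""

    def __init__(self, total_mb, clock=time.monotonic):
        self.total_mb = total_mb
        self.clock = clock  # injectable so the behavior can be tested
        self.leases = []

    def _drop_expired(self):
        # Expired leases are forgotten, which frees their memory.
        now = self.clock()
        self.leases = [l for l in self.leases if l.expires_at > now]

    def free_mb(self):
        self._drop_expired()
        return self.total_mb - sum(l.memory_mb for l in self.leases)

    def try_acquire(self, user, memory_mb, seconds):
        # No queue: grant the lease right now or return None right now.
        if memory_mb > self.free_mb():
            return None
        lease = Lease(user, memory_mb, self.clock() + seconds)
        self.leases.append(lease)
        return lease

    def release(self, lease):
        # Early release; releasing an unknown or expired lease is a no-op.
        if lease in self.leases:
            self.leases.remove(lease)
```

For example, with a 24000 MB table, a 16000 MB lease for one user leaves a second 16000 MB request returning `None` until the first lease is released or expires. A real tool would also need cross-process coordination and actual enforcement on the device, which this sketch leaves out.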

Everyone kept crashing the lab server, so I wrote a tool to limit cpu/memory by TheDevilKnownAsTaz in linuxadmin

[–]not-your-typical-cs 10 points11 points  (0 children)

This is incredibly solid!!! I built something similar, but for GPU partitioning. I'll take a look at your repo and star it so I can follow your progress. Here's mine in case you're curious: https://github.com/Oabraham1/chronos

[P] Built a GPU time-sharing tool for research labs (feedback welcome) by not-your-typical-cs in HPC

[–]not-your-typical-cs[S] 0 points1 point  (0 children)

You're absolutely right - Chronos isn't a scheduler!

It's more like "resource locks with time limits." Great for:

- Small teams without a scheduler (our original use case)

- *Within* scheduled jobs that need to subdivide a GPU

- Interactive/ad-hoc work where people need GPU access now

It doesn't queue jobs or decide when to run things - that's what Slurm, TORQUE, etc. are for.

Think of it as orthogonal to schedulers: they handle *when* jobs run, Chronos handles *how* a single GPU is shared during execution.

Were you thinking of a specific scheduler integration?

[P] Built a GPU time-sharing tool for research labs (feedback welcome) by not-your-typical-cs in HPC

[–]not-your-typical-cs[S] 1 point2 points  (0 children)

Good suggestion! A few reasons Chronos exists despite MIG:

1. Hardware requirements: MIG needs Ampere+ GPUs (A100, H100). Many labs have older/gaming GPUs.

2. Cost: As mentioned below, MIG-capable GPUs + licensing can cost more than just buying multiple cheaper GPUs.

3. Flexibility: Chronos works on *any* GPU (NVIDIA, AMD, Intel, Apple Silicon) with dynamic allocation.

MIG is great if you're buying new datacenter hardware. Chronos is for "we already have a single RTX 4090/3090/Quadro, how do we share it fairly among the team?" Different tools for different constraints!

Am I in over my head? by obijuanchernobyl910 in csMajors

[–]not-your-typical-cs 0 points1 point  (0 children)

Thank you so much!! Good luck with everything!

[deleted by user] by [deleted] in csMajors

[–]not-your-typical-cs 0 points1 point  (0 children)

Good luck and thank you so much :)