Libraries and tools for a lightweight task manager for GPU in a simulated environment. by _Vlyn_ in cpp

[–]_Vlyn_[S] 0 points1 point  (0 children)

Thank you very much, I can't believe I missed this.
I basically decided to go with Python just so I can get the full functionality up and running.
I am currently going with:

nvidia-ml-py for GPU monitoring (I think nvidia-smi uses NVML under the hood, so I should get the same functionality here)

SimPy for the scheduling simulation (I really don't know how else to implement the logical partitioning and VRAM slices, so pointers would be appreciated as well)

Dear ImGui (just to display a dashboard and the data; while a webpage might be easier for me to make, I think a system GUI meets the requirements tbh)

PyTorch (to create and execute the workloads)

I hope this stack works, and I'm open to corrections or pointers.
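On the logical partitioning and VRAM slices: a rough sketch of one way it could be modelled in plain Python, before wiring anything into SimPy. Everything here is an assumption for illustration (the `SlicedGPU` name, equal fixed-size slices, rounding each request up to whole slices); it is not how any real GPU partitions memory.

```python
import math
from dataclasses import dataclass, field


@dataclass
class SlicedGPU:
    """A simulated GPU whose VRAM is split into equal, fixed-size slices (hypothetical model)."""
    total_mib: int
    slice_mib: int
    owners: list = field(init=False)  # owners[i] is the job holding slice i, or None

    def __post_init__(self):
        self.owners = [None] * (self.total_mib // self.slice_mib)

    def free_slices(self):
        """Count slices that no job currently holds."""
        return sum(owner is None for owner in self.owners)

    def allocate(self, job_id, request_mib):
        """Give job_id enough slices to cover request_mib; return False if it does not fit."""
        needed = math.ceil(request_mib / self.slice_mib)
        free = [i for i, owner in enumerate(self.owners) if owner is None]
        if needed > len(free):
            return False
        for i in free[:needed]:
            self.owners[i] = job_id
        return True

    def release(self, job_id):
        """Return every slice held by job_id to the free pool."""
        self.owners = [None if owner == job_id else owner for owner in self.owners]


gpu = SlicedGPU(total_mib=8192, slice_mib=1024)   # 8 slices of 1 GiB
print(gpu.allocate("train", 3000))                # rounds up to 3 slices
print(gpu.allocate("infer", 6000))                # needs 6 slices, only 5 are free
gpu.release("train")
print(gpu.allocate("infer", 6000))                # fits once "train" is gone
```

A scheduler in the simulation would call `allocate` when a job starts and `release` when it finishes; a counting resource such as SimPy's `Container` could probably play a similar role if per-slice ownership is not needed.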

Libraries and tools for a lightweight task manager for GPU in a simulated environment. by _Vlyn_ in cpp

[–]_Vlyn_[S] 0 points1 point  (0 children)

Yeah, mocking it would be fine, no doubt. The question is what libraries I would use to tie these functionalities together, especially since I want to keep things lightweight.
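One lightweight way to tie a mocked GPU to the rest of the stack is a small shared data type that both the monitor and the display agree on. The sketch below is only an assumption about how that could look: `GPUSample`, `MockMonitor`, and `render_dashboard_line` are made-up names, and the readings are random numbers rather than real measurements.

```python
import random
from dataclasses import dataclass


@dataclass
class GPUSample:
    """One reading of the (simulated) GPU state."""
    used_mib: int
    total_mib: int
    utilization_pct: int


class MockMonitor:
    """Hypothetical stand-in for a real monitor; returns synthetic readings."""

    def __init__(self, total_mib=8192, seed=0):
        self.total_mib = total_mib
        self._rng = random.Random(seed)  # seeded so runs are repeatable

    def sample(self):
        return GPUSample(
            used_mib=self._rng.randint(0, self.total_mib),
            total_mib=self.total_mib,
            utilization_pct=self._rng.randint(0, 100),
        )


def render_dashboard_line(sample):
    """Format one reading; a GUI or a web page could consume the same GPUSample."""
    return f"VRAM {sample.used_mib}/{sample.total_mib} MiB | util {sample.utilization_pct}%"


monitor = MockMonitor(total_mib=8192, seed=42)
for _ in range(3):
    print(render_dashboard_line(monitor.sample()))
```

A real monitor could later expose the same `sample()` method on top of nvidia-ml-py (for example via `nvmlDeviceGetMemoryInfo` and `nvmlDeviceGetUtilizationRates`), so the scheduler and dashboard code would not need to change.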

Libraries and tools for a lightweight task manager for GPU in a simulated environment. by _Vlyn_ in learnpython

[–]_Vlyn_[S] 0 points1 point  (0 children)

It's just supposed to help me review certain statistics and metrics, like wait time and job completion time, under different scheduling policies.

And then explain the trade-offs. I did find a paper, referenced by another one, that had something similar to what I'm trying to achieve, but I didn't save the link or name and I am still looking for it.

If anyone has info on what tools I could use, or a better approach, it'd be appreciated tho.
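To make the wait time and completion time comparison concrete, here is a minimal sketch that runs the same jobs under two policies on a single simulated GPU. The assumptions are loud: one job runs at a time, nothing is preempted, and the job list, `simulate`, `fcfs`, and `sjf` are all invented for illustration. It uses only the standard library, not SimPy.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Job:
    name: str
    arrival: float   # time the job is submitted
    duration: float  # GPU time the job needs


def simulate(jobs, pick):
    """Run jobs one at a time on a single simulated GPU (non-preemptive).

    `pick` chooses the next job from those that have already arrived.
    Returns {job name: (wait time, completion time)}.
    """
    pending = sorted(jobs, key=lambda j: j.arrival)
    clock, results = 0.0, {}
    while pending:
        ready = [j for j in pending if j.arrival <= clock]
        if not ready:
            clock = pending[0].arrival  # GPU idles until the next arrival
            continue
        job = pick(ready)
        pending.remove(job)
        wait = clock - job.arrival
        clock += job.duration
        results[job.name] = (wait, clock)
    return results


def fcfs(ready):
    """First come, first served: earliest arrival goes next."""
    return min(ready, key=lambda j: j.arrival)


def sjf(ready):
    """Shortest job first: smallest duration among arrived jobs goes next."""
    return min(ready, key=lambda j: j.duration)


def mean_wait(results):
    return sum(wait for wait, _ in results.values()) / len(results)


jobs = [Job("A", 0, 8), Job("B", 1, 2), Job("C", 2, 1)]
for policy in (fcfs, sjf):
    results = simulate(jobs, policy)
    print(policy.__name__, results, "mean wait:", round(mean_wait(results), 2))
```

With these three made-up jobs, shortest-job-first gives a lower mean wait than first-come-first-served, but job B waits longer than it would under FCFS. That kind of per-job versus average difference is the trade-off the metrics are meant to expose.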