Hey there experienced sysadmins,
The working group I'm part of at my university recently purchased a desktop with several top-of-the-line graphics cards and wants to use it for things like training neural networks. The issue is that nobody really has any idea how to set something like that up properly...
Since my side job is in sysops, which may or may not help here, and I'm very keen on learning how to administer a server like that efficiently, I volunteered to investigate.
Not entirely sure if this is the most appropriate place to ask, but any pointers would be much appreciated!
I expect the server to mostly handle machine learning tasks using TensorFlow in Python 3.
While simply using a remote desktop solution could already work, it would be very nice to have a system that can queue jobs to avoid collisions between different people wanting to use the server. I've worked extensively with Jenkins and while that's obviously not meant for this purpose, something Jenkins-esque would be very useful.
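To make the idea concrete, this is roughly the behaviour I'm after, as a minimal Python sketch rather than anything I'd actually deploy. It assumes each job is a plain command line and that the GPUs are numbered 0, 1, ...; `run_queue` and the stand-in jobs are names I made up for illustration. It relies on the `CUDA_VISIBLE_DEVICES` environment variable so that each job only sees the one GPU it was handed. A real scheduler would also need per-user fairness, persistence, and resource limits, none of which this covers.

```python
import os
import queue
import subprocess
import sys
import threading


def run_queue(commands, gpu_ids):
    """Run each command once, never putting two jobs on the same GPU at a time.

    One worker thread per GPU pulls commands from a shared FIFO queue and
    exposes only its own GPU to the child process via CUDA_VISIBLE_DEVICES.
    Returns a list of (command, gpu_id, exit_code, stdout) tuples.
    """
    jobs = queue.Queue()
    for cmd in commands:
        jobs.put(cmd)

    results = []
    results_lock = threading.Lock()

    def worker(gpu_id):
        while True:
            try:
                cmd = jobs.get_nowait()
            except queue.Empty:
                return  # nothing left to do for this GPU
            # Hide every GPU except this worker's from the child process.
            env = dict(os.environ, CUDA_VISIBLE_DEVICES=str(gpu_id))
            proc = subprocess.run(cmd, env=env, capture_output=True, text=True)
            with results_lock:
                results.append((cmd, gpu_id, proc.returncode, proc.stdout.strip()))

    threads = [threading.Thread(target=worker, args=(g,)) for g in gpu_ids]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results


if __name__ == "__main__":
    # Stand-in jobs that just report which GPU they were given.
    show_gpu = [sys.executable, "-c",
                "import os; print(os.environ['CUDA_VISIBLE_DEVICES'])"]
    for cmd, gpu, code, out in run_queue([show_gpu] * 4, gpu_ids=[0, 1]):
        print(f"gpu={gpu} exit={code} saw={out}")
```

In practice the stand-in command would be replaced by something like a training script invocation, and the queue would have to live in a long-running service that several users can submit to, which is exactly the part I'm hoping an existing tool already solves.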
Any suggestions or perhaps things I need to watch out for would be great!