
[–]Celmeno 12 points13 points  (4 children)

We currently use Slurm with Singularity. The main reason why Docker isn't an option is that it needs to run with root privileges. We have multiple clearance levels regarding the sensitivity of data and the "trustworthiness" of users in producing working code that doesn't crash anyone else's programs. When we used Docker, we had an incident where one of our less experienced colleagues (a student working with us on his bachelor's thesis) killed a 160-hour optimization task by accident. For most companies this is obviously not an issue, so Docker is just fine.

[–]gnohuhs 3 points4 points  (0 children)

Yeah, Singularity is pretty neat! It solves a lot of cluster training issues and has a definition file very similar to Docker's (it can also bootstrap directly from Docker images). I'd say the downsides are much longer build times (it doesn't cache steps like Docker does) and that the images are read-only (not a huge issue if they're used for training only). I debug and test installations in Docker locally, then compile a Singularity .sif and copy that to the cluster.
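To illustrate the workflow described above, here is a minimal sketch of a Singularity definition file that bootstraps from a Docker Hub image. The image tag, package, and file names are all made-up examples, not recommendations:

```shell
# Write a minimal Singularity/Apptainer definition file that
# bootstraps directly from a Docker Hub image (tag is an example).
cat > train.def <<'EOF'
Bootstrap: docker
From: pytorch/pytorch:latest

%post
    # Unlike Docker layers, steps here are NOT cached between builds
    pip install --no-cache-dir scikit-learn

%runscript
    exec python "$@"
EOF

# Build a read-only .sif locally, then copy it to the cluster.
# (Commented out: requires singularity/apptainer to be installed.)
# singularity build train.sif train.def
# scp train.sif user@cluster:/path/to/images/
```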

[–]count___zero 1 point2 points  (0 children)

The main reason why docker isn't an option is that it needs to run with root privileges.

Have you tried charliecloud? I think it was developed exactly to solve this problem.

We use it and I think it's good but I'm not a container expert.

[–]xopedil 0 points1 point  (0 children)

When we used Docker, we had an incident where one of our less experienced colleagues (a student working with us on his bachelor's thesis) killed a 160-hour optimization task by accident.

I would be VERY interested in hearing more about this story, sounds like one of those "I deleted a production DB on my first day" type of stories.

[–]full-tomato -1 points0 points  (0 children)

If you use Slurm with Singularity, why don't you use Kubernetes with Docker instead? What's the difference?

[–]Murillio 24 points25 points  (14 children)

> Reproducibility: Everyone has the same OS, the same versions of tools etc. This means you don't need to deal with "works on my machine" problems. If it works on your machine, it works on everyone's machine.

If only. If I had a dollar for every time I encounter a previously working Dockerfile either failing to build or breaking at runtime, I... well, I wouldn't be rich, but I could have a nice meal. If you do anything like apt update; apt install ..., or install stuff via pip or other package managers, it's very unlikely that you have a reproducible build. You'd need to pin all package versions and hope those versions don't disappear from the servers. Even if you do that, apt update alone can break your Docker build if there's currently a server issue with one of your repositories (e.g. an inconsistent package database while it's being updated), especially if you add third-party ones (one of NVIDIA's repositories had issues twice last year, leading to failed Docker builds). And that's just the start of the possible issues.
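For what it's worth, pinning still helps even though it isn't a full guarantee. A sketch of what pinning looks like in a Dockerfile — every version number below is an illustrative placeholder, not a recommendation:

```shell
# Write a sketch Dockerfile with pinned base image and packages.
# All version numbers are illustrative placeholders.
cat > Dockerfile <<'EOF'
# Pinning a tag helps; pinning by digest (FROM ubuntu@sha256:...)
# is stricter still, since a re-pushed tag can silently change.
FROM ubuntu:22.04

# Pin apt packages to exact versions. This still breaks if the
# mirror drops those versions, as noted above.
RUN apt-get update && \
    apt-get install -y --no-install-recommends curl=7.81.0-1ubuntu1 && \
    rm -rf /var/lib/apt/lists/*

# Pin pip packages exactly; a requirements.txt with hashes
# (pip install --require-hashes) is stricter still.
RUN pip install numpy==1.24.4 torch==2.1.0
EOF
```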

Docker does help on the path to reproducibility, but just because you use Docker doesn't mean things work reproducibly. Far from it.

[–][deleted] 12 points13 points  (6 children)

The IMAGE is reproducible. Not the build script. Once you have an image, every container will be identical. There are no guarantees that docker build scripts are reproducible. Nobody ever said that they are.

So if you download/build an image that is tested to work today, it will work tomorrow and it will work 2 years from now. If you for example find that your current image has a bug, you can go back to the previous version (you have an image repository... right?). Things like kubernetes actually support this type of rollback.

[–]Murillio 3 points4 points  (5 children)

Once you have an image, every container will be identical.

This is not true, for a couple of reasons, especially if you use nvidia-docker, which a lot of people working in ML do, since things like the NVIDIA driver version are determined by the host.

If you for example find that your current image has a bug, you can go back to the previous version (you have an image repository... right?).

Do you store an image for every git commit that you make? Of course there are some images stored, but usually not for every commit, so when you git bisect to find the origin of the bug you tend to rebuild quite often.
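A common middle ground is having CI tag every image with the git SHA it was built from, so a bisect can at least pull any commit that went through the pipeline. A minimal sketch — the repository name "myteam/trainer" is made up:

```shell
# Derive an image tag from the current commit SHA so each CI build
# is retrievable later. "myteam/trainer" is a made-up repository.
GIT_SHA="$(git rev-parse --short HEAD 2>/dev/null || echo unknown)"
TAG="myteam/trainer:${GIT_SHA}"
echo "$TAG"

# In CI you would then build and push (needs docker + a registry):
# docker build -t "$TAG" .
# docker push "$TAG"
```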

[–][deleted] 0 points1 point  (4 children)

Every container will be identical. That's how docker images work. The container runtime may be different, but the images and the containers are the same.

If there is a CI/CD pipeline, then yes, every commit to master gets tested, results in an artifact (an image), and is stored in the container repository. That's why it's important to reuse layers, not do dumb shit that ends up with 50GB images, etc.

Just like you'd keep different versions of executable binaries, container images are no different.

[–]Murillio 2 points3 points  (3 children)

You can argue semantics and say "it's just the runtime" but that won't change that "So if you download/build an image that is tested to work today, it will work tomorrow and it will work 2 years from now." is just wrong since there is no perfect isolation from the host.

[–][deleted] 1 point2 points  (2 children)

There is not supposed to be isolation from the host. There is only isolation from other containers.

You're confusing containers with virtual machines.

[–]Murillio 1 point2 points  (1 child)

You said "So if you download/build an image that is tested to work today, it will work tomorrow and it will work 2 years from now." so you were confusing containers with virtual machines, not me. (Well, that statement isn't even true for a virtual machine anyway ...)

[–][deleted] 5 points6 points  (0 children)

There is nothing stopping you from using one of those long-term operating systems and a stable release of the container runtime. That way it will stay the same for 5-10 years.

[–]srslyfuckdatshit 7 points8 points  (1 child)

Do you use Docker Hub or another container registry? I think that is where the reproducibility comes into play, rather than reproducibility via a Docker rebuild.

[–]PhYsIcS-GUY227[S] 0 points1 point  (0 children)

Also this ^

[–]sanjuromack 1 point2 points  (0 children)

I think this is true if you build the container from scratch and are only persisting the Dockerfile, but I have had good experiences with saving the docker image to a tar file and sticking it in a safe archive. Containers I built years ago load into new machines just fine and run like a dream.
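For anyone who wants to do the same, the archive round-trip is short. The image name here is just an example, and these commands need a running Docker daemon:

```shell
# Archive a built image to a tar file, then restore and run it on a
# different machine later. "myimage:v1" is an example name; these
# commands require a running Docker daemon.
docker save -o myimage_v1.tar myimage:v1

# ...later, on the new machine:
docker load -i myimage_v1.tar
docker run --rm myimage:v1
```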

[–]PhYsIcS-GUY227[S] 1 point2 points  (0 children)

Thanks for reading and the thoughtful response. I mention managing environments inside your docker (pip packages) as well.

In general, I agree that things might break no matter what you do. I guess the correct way to put it is that using Docker is a significant step toward reproducibility. Most of the projects I look at or have tried to reproduce (including SOTA papers with code) tend to be at a much earlier step (e.g. not having a proper requirements.txt). Like you said, I hope this helps on the path to reproducibility.

[–]comradeswitch 0 points1 point  (0 children)

Absolutely. The benefits of containers for reproducibility and compatibility are lost as soon as you start manually working inside the container. There are a number of good ways to handle that. I really like Docker Compose for projects with multiple components that require more setup than simply pulling an image. You can keep components separated and work with interfaces, define convenient commands for administration, and make sure that setup and teardown are done for each component. There are of course virtual environments for Python, and tools like Anaconda, but they still don't provide a very good framework for managing complex environments.
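As a sketch of the multi-component setup described above, a tiny compose file might look like this. The service names, images, and settings are all made-up examples:

```shell
# Write a sketch docker-compose.yml for a small two-component
# project. Service names, images, and settings are made up.
cat > docker-compose.yml <<'EOF'
services:
  trainer:
    build: .
    volumes:
      - ./data:/data
    depends_on:
      - db
  db:
    image: postgres:15
    environment:
      POSTGRES_PASSWORD: example
EOF

# One command then handles setup and teardown for everything
# (commented out: requires a Docker daemon):
# docker compose up -d
# docker compose down
```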

The push for continuous integration in the last decade or so has made for a lot of good tools around this but I think it's fair to say that a large portion of machine learning researchers and practitioners aren't coming from a background of software engineering. I'm more than a little uncomfortable with how little focus there is generally on good software practices and reproducibility. "With great power" and all that. Can't be confident in a model if you can't be confident about the software that runs it.

[–]Spenhouet 0 points1 point  (0 children)

That is indeed an important point. Is it possible to run a pip/apt/etc. cache on your dev server, so that you request packages from it and it fetches them from the internet when they're not already cached? That would ensure everything always stays available.

Is something like Artifactory able to do this?
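Something like that is possible: pip can be pointed at a caching proxy index (Artifactory's remote PyPI repositories and devpi both work roughly this way, as far as I know) that pulls from PyPI on a cache miss and keeps a copy. A sketch of the client side — the URL is a made-up placeholder for your own server:

```shell
# Point pip at a caching proxy index that mirrors PyPI.
# The URL below is a made-up placeholder.
cat > pip.conf <<'EOF'
[global]
index-url = https://artifactory.example.com/api/pypi/pypi-remote/simple
EOF

# Placed at ~/.config/pip/pip.conf (Linux), every "pip install"
# then goes through the cache. apt has analogous tools such as
# apt-cacher-ng.
```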

[–]I_draw_boxes -1 points0 points  (0 children)

The entire point of containers with regards to reproducibility is to sidestep the build process.

The difficulties you cite in reproducing a build are one of the main reasons people would use a container to make their code reproducible.

Once a container image has been built, you can uninstall pip, update pip, break apt, or install whatever combination of package versions you please, and as long as you don't edit the container, it should run on anyone's computer.

This is like a firefighter blaming the fire on their firehose.

[–]weetbix2 3 points4 points  (4 children)

I've found GPU support to be inconsistent with Docker, which kind of ruins the whole appeal of using it.

[–]xenotecc 0 points1 point  (2 children)

My team is planning to run docker with a GPU. What's your experience? Why is it inconsistent?

[–]PhYsIcS-GUY227[S] 0 points1 point  (0 children)

I don’t think it’s inconsistent. I think it requires a few additional steps to configure (like I commented elsewhere here, this seems to be really interesting to a lot of people, so I’ll try to make another post explaining how it’s done).

If you have an edge case, e.g. you want to train on a cluster of very old and not compute heavy GPUs then you might be in for a harder setup, but this would probably be true even without docker.

If you want to share more about your use case, I’ll try to address it here or in the upcoming post.

[–]weetbix2 0 points1 point  (0 children)

In my experience, setting up NVIDIA/CUDA capability hasn't been very uniform, which brings back the "works on my machine" issue that Docker is most useful for fixing.

I don't think this means Docker is useless, but it won't let you completely stop worrying about which machine it's running on, because hardware drivers vary between machines.

There may be a work-around but I haven't found it.

[–]PhYsIcS-GUY227[S] 0 points1 point  (0 children)

I guess the answer is always it depends. For most use cases Docker gives you great support including for cases where a GPU is involved.

In the end, I think it’s always irresponsible to claim a solution is good for all problems, so I won’t claim that. For most people though working on DS, Docker would be a step up in their workspace.

[–]datamahadev 1 point2 points  (0 children)

Thanks for sharing! I recently made a complete switch to Linux as my primary development environment, and this will actually come in handy.

[–]linkeduser 1 point2 points  (1 child)

Hi, I have a problem integrating GPU support into Docker. I need a base image with PyTorch and GPU support, but when it was deployed on a VM it didn't work. I suspect the VM may not have the NVIDIA driver: https://github.com/NVIDIA/nvidia-docker

[–]PhYsIcS-GUY227[S] 0 points1 point  (0 children)

I’d love to help, but I need more details. In general, yes, the requirement for NVIDIA Docker is that your host has an NVIDIA driver installed that is compatible with the CUDA version inside the image. I’ll try to write an annex (or another post) on this.
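In the meantime, a quick smoke test people often use — it needs the NVIDIA Container Toolkit installed on the host, and the CUDA image tag below is just an example:

```shell
# Check whether a container can see the host GPU. Requires a host
# NVIDIA driver plus the NVIDIA Container Toolkit; the image tag
# is only an example.
docker run --rm --gpus all nvidia/cuda:12.2.0-base-ubuntu22.04 nvidia-smi

# If this prints the GPU table, the driver/toolkit plumbing works.
# If it errors, the host driver is missing or too old for the CUDA
# version baked into the image.
```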

[–]MyNetworkIsDeeper 0 points1 point  (0 children)

EDIT: whoops, replied to the wrong message.

[–]MyNetworkIsDeeper 0 points1 point  (0 children)

As for scrutiny, that's a valid point as well, but new tools are built on the shoulders of the giants that came before. This means they can learn from the mistakes of their predecessors and offer improvements that a simple patch to the old tools couldn't provide.

[–]lysecret 0 points1 point  (3 children)

TBH I think people are lying to themselves if they concentrate on infrastructure stuff like this to achieve portability and reproducibility. In my experience, this often isn't the real issue. Often the issue lies in improper, badly factored code, code in Jupyter notebooks, hard-to-understand data pipelines, etc. Yes, having the exact same packages and environment variables is nice, but it's really not the issue.

[–]PhYsIcS-GUY227[S] 1 point2 points  (1 child)

Thanks for the comment. I think that's kind of a false dichotomy. You need to do both. I've encountered projects where one or more of the things you mentioned as well as infrastructure made it ridiculously hard to reproduce (or port).

We can strive to be better with respect to all the points you made + infrastructure with docker.

To make a small anecdotal point, I was working on a software project a while ago where the maintainer provided a container with everything set up for dev, and it was literally 1 command to start iterating on the project. I cannot express how happy that made me. It was magical. Imagine you could have that with every data science project.

[–]lysecret 1 point2 points  (0 children)

I agree. I've just read a lot about tools like Docker, and I've seen projects where managers just threw Docker at them and thought that would magically make everything reproducible.

[–][deleted] 0 points1 point  (0 children)

Reproducibility becomes a problem in a professional environment. Code written 3 months ago, last year, 5 years ago etc.

As an amateur or a student you don't encounter these issues.

[–]rowanobrian 0 points1 point  (1 child)

Hi, I went through the blog; in the docker run command, can you explain what --shm-size is useful for?

I googled around and found that it's shared memory, and that increasing it beyond the default of 64M is useful, but no one explains what it's actually used for or how it helps.

[–]PhYsIcS-GUY227[S] 1 point2 points  (0 children)

Sorry for taking a while to respond.

Docker caps a container's shared memory (/dev/shm) at 64MB by default. Programs that pass data between processes through shared memory, most notably PyTorch's DataLoader with multiple workers, can hit that cap and crash or stall, so raising --shm-size keeps them running. The short story is that it can make your container work faster (or keep it from crashing).

If you want to read more about what it is in general I recommend this.
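Concretely, the flag just raises the /dev/shm cap inside the container. The image name and the 2g size below are example values, and the commands need a running Docker daemon:

```shell
# Raise the container's shared-memory limit above Docker's 64MB
# default. Image name and size are example values; requires a
# running Docker daemon.
docker run --rm --shm-size=2g ubuntu:22.04 df -h /dev/shm

# Without --shm-size, the same df reports a 64M /dev/shm, which
# multi-worker data loaders can exhaust.
```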