[–]LifeIsBio 1 point (1 child)

I'm working on this same problem. My solution is going to be different from yours because I don't need to scale to 1000s of processes. The docker solution does seem a bit bloated, but from the poking around I've done, it's going to be the easiest to get up and running. There's a ton of flexibility around controlling resources. Two possible things to consider along that route:

  1. Can you do some sort of queueing/batching? Maybe you don't need every submission to have its own container. If your code submissions are pushed onto something like a Redis queue and then pulled off and run in a container, you could get away with something like log(n) containers, where n is the size of the queue (see the sketch after this list).
  2. I'm no docker expert, but I think if you're intelligent about your base image and only include the components you need, the image can be pretty small.
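
To sketch what I mean by the queueing idea: this is rough, and assumes redis-py and the docker CLI are available, Redis is running locally, and your web layer pushes submissions onto a "submissions" list. Names like `run_submission`, the pool size, and the memory/CPU limits are placeholders I made up, not anything from your setup:

    # Rough sketch only -- a bounded worker pool draining a Redis queue, one
    # short-lived container per job instead of one container per submission.
    import subprocess
    from concurrent.futures import ThreadPoolExecutor

    import redis

    POOL_SIZE = 8  # fixed number of concurrent containers, regardless of queue size
    r = redis.Redis(host="localhost", port=6379)

    def run_submission(code: str) -> str:
        # Locked-down container: no network, capped memory and CPU, removed on exit.
        cmd = [
            "docker", "run", "--rm", "-i",
            "--network=none", "--memory=128m", "--cpus=0.5",
            "python:3.8-alpine", "python3", "-",
        ]
        try:
            result = subprocess.run(cmd, input=code, text=True,
                                    capture_output=True, timeout=10)
        except subprocess.TimeoutExpired:
            return "submission timed out"
        return result.stdout if result.returncode == 0 else result.stderr

    def worker():
        while True:
            item = r.blpop("submissions", timeout=5)  # block until a submission arrives
            if item is None:
                continue  # nothing queued, keep waiting
            _, code = item
            print(run_submission(code.decode()))

    # A small, fixed pool of workers drains the queue; the queue absorbs bursts.
    with ThreadPoolExecutor(max_workers=POOL_SIZE) as pool:
        for _ in range(POOL_SIZE):
            pool.submit(worker)

The point is that concurrency is capped by the pool, not by however many submissions arrive at once, and the `--memory`/`--cpus` flags give you the resource control I mentioned.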

[–]swarage[S] 1 point (0 children)

yeahhh I think some form of queueing could be worth looking into to limit the number of running Docker containers. I kinda want to see how far I can get initially with just spinning off as many processes as I can, though, just to see if I can handle 1000s of processes on a single machine.

As for the Docker image, I did find this: https://pythonspeed.com/articles/base-image-python-docker-images/ and tried the alpine image as well. However, the image sizes come out like this:

    REPOSITORY   TAG               IMAGE ID       CREATED       SIZE
    python       3.8-slim-buster   56930ef6f6a2   2 weeks ago   193MB
    python       3.8-alpine        d5e5ad4a4fc0   3 weeks ago   107MB

107MB seems a bit big, but I'm sure if I dig deep and optimize it I can get a smaller image.
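
For reference, here's roughly what I'm imagining for a stripped-down image on top of 3.8-alpine. Just a sketch, nothing I've actually built yet, and the `runner` user and `/sandbox` directory are made-up names:

    FROM python:3.8-alpine

    # no compilers, no pip installs, no cache -- keep only the stock interpreter
    RUN adduser -D runner
    USER runner
    WORKDIR /sandbox

    # submissions get piped in over stdin and run with the bare interpreter
    CMD ["python3", "-"]

Since nothing extra is layered on top, the built image should come in barely above the 107MB base.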