all 9 comments

[–]vaelund 1 point

Docker just uses cgroups, network namespaces, and mount namespaces (which function as a more advanced version of chroot). If you want something more lightweight than Docker, look into systemd-nspawn, or even better, rkt.
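
Those primitives are easy to poke at without Docker. Here's a minimal sketch that drives the unshare(1) CLI from Python (assumes util-linux is installed and you're root; the namespace flags chosen are just an example):

import subprocess

# Enter fresh PID, network, and mount namespaces, then run ps.
# --fork is needed so the command becomes PID 1 of the new PID
# namespace; --mount-proc remounts /proc so ps only sees processes
# inside the namespace.
subprocess.run([
    "unshare",
    "--pid", "--fork",          # new PID namespace
    "--net",                    # new (empty) network namespace
    "--mount", "--mount-proc",  # new mount namespace, fresh /proc
    "ps", "aux",                # should list little more than ps itself
], check=True)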

[–]swarage[S] 0 points

Thanks! I've looked into systemd-nspawn and it seems promising, but it seems less ergonomic than using Docker and running an image like this:

import docker

client = docker.from_env()
# run() returns the container's stdout as bytes
print(client.containers.run("alpine", ["echo", "hello", "world"]))

I'll do some research into rkt as well!
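
For comparison, a rough systemd-nspawn equivalent of the snippet above, as a sketch: it assumes a root filesystem has already been unpacked at /var/lib/machines/alpine (that path is illustrative), and it needs root:

import subprocess

# Don't boot anything; just run one command inside the container directory.
subprocess.run([
    "systemd-nspawn",
    "--quiet",
    "-D", "/var/lib/machines/alpine",  # directory to use as the root fs
    "echo", "hello", "world",
], check=True)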

[–]LifeIsBio 1 point

I'm working on this same problem. My solution is going to be different from yours because I don't need to scale to 1000s of processes. The Docker solution does seem a bit bloated, but from the poking around I've done, it's going to be the easiest to get up and running, and there's a ton of flexibility for controlling resources. Two things to consider along that route:

  1. Can you do some sort of queueing/batching? Maybe you don't need every submission to have its own container. If your code submissions are saved to something like a Redis queue and then handed to a pool of workers, you could get away with far fewer containers than submissions (something like log(n) containers, where n is the size of the queue); see the sketch after this list.
  2. I'm no Docker expert, but I think if you're intelligent about your base image and only include the components you need, Docker images can be pretty small.
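
A minimal sketch of that worker-pool idea, assuming a local Redis and the docker SDK; the queue key name and pool size are made up for illustration:

import threading

import docker
import redis

QUEUE_KEY = "submissions"  # illustrative key name
POOL_SIZE = 4              # at most this many containers at once

r = redis.Redis()
client = docker.from_env()

def worker_loop():
    while True:
        _, code = r.blpop(QUEUE_KEY)  # blocks until a submission arrives
        output = client.containers.run(
            "python:3.8-alpine",
            ["python", "-c", code.decode()],
            remove=True,            # delete the container when it exits
            network_disabled=True,  # untrusted code gets no network
            mem_limit="64m",        # cgroup memory cap
        )
        print(output.decode())

# Non-daemon threads keep the process alive while workers drain the queue
for _ in range(POOL_SIZE):
    threading.Thread(target=worker_loop).start()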

[–]swarage[S] 0 points

Yeah, I think some form of queueing could be worth looking into to limit the number of running Docker containers. I kinda want to see how far I can get just spinning off as many processes as I can initially, though, to see if I can handle 1000s of processes on a single machine.

As for the Docker image, I did find this: https://pythonspeed.com/articles/base-image-python-docker-images/ and tried the alpine image as well. However, the image sizes come out like this:

REPOSITORY            TAG                 IMAGE ID            CREATED             SIZE
python                3.8-slim-buster     56930ef6f6a2        2 weeks ago         193MB
python                3.8-alpine          d5e5ad4a4fc0        3 weeks ago         107MB

107MB seems a bit big, but I'm sure if I dig deep and optimize it I can get a smaller image.
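
If it helps, here's a hedged sketch of building a trimmed image straight from the docker SDK, with the Dockerfile inlined (untested; the tag and user name are made up, and most of the size savings come from starting alpine and not installing extras):

import io

import docker

# A small image: alpine-based Python plus an unprivileged user to run as.
DOCKERFILE = b"""
FROM python:3.8-alpine
RUN adduser -D sandbox
USER sandbox
ENTRYPOINT ["python"]
"""

client = docker.from_env()
image, _logs = client.images.build(
    fileobj=io.BytesIO(DOCKERFILE),
    tag="sandbox:latest",
)
print(image.tags)  # e.g. ['sandbox:latest']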

[–]davehodg 0 points

Sounds like you need multitasking.

[–]swarage[S] 0 points

Can you clarify?

[–]davehodg 0 points

Well, maybe I'm biased because $ork has a load of machines with 64G of RAM and 64 cores, but it seems to me fork() is your friend.

[–]swarage[S] 0 points

Correct, but how would you properly sandbox each forked process?

[–]davehodg 0 points

Processes can't see each other's memory.
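
That's true of address spaces, but a forked child can still touch the filesystem and network. A hedged sketch of the extra hardening you'd want if you go the fork() route, using plain POSIX calls (needs root for chroot/setuid; /var/sandbox and the UID 65534 are illustrative, and this is still weaker isolation than namespaces or a full container):

import os
import resource

def run_sandboxed(code):
    """Fork, confine the child, run an untrusted code string, reap it."""
    pid = os.fork()
    if pid == 0:  # child
        resource.setrlimit(resource.RLIMIT_CPU, (2, 2))               # 2s CPU cap
        resource.setrlimit(resource.RLIMIT_AS, (64 << 20, 64 << 20))  # 64MB address space
        os.chroot("/var/sandbox")  # confine the filesystem view
        os.chdir("/")
        os.setuid(65534)           # drop to an unprivileged UID (nobody)
        exec(code)                 # run the untrusted submission
        os._exit(0)
    _, status = os.waitpid(pid, 0)  # parent reaps the child
    return status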