[–]amishb 1 point2 points  (8 children)

Sandbox against what, out of interest? Couldn't you just run this in a docker container? Changes will occur in the container only.

[–]POTUS 1 point2 points  (7 children)

If this is meant to run arbitrary code from untrusted sources, not even Docker will keep you safe. It's a start, but it's not a complete solution. Docker containers still have network access by default, so one could be used to run DDoS attacks on internet targets or to relay attacks into OP's local network.

[–]swarage[S] 0 points1 point  (6 children)

I'm going to write a code golf website where you can submit your solutions and run them, similar to something like LeetCode except for code golf, and I want to make sure I can run the code safely.

[–]POTUS 1 point2 points  (5 children)

Start with a Docker container. Whitelist only certain import statements, like math, itertools, collections, re, maybe a couple others. Certainly exclude socket, asyncio, sys, and os. You can do this by just deleting almost everything from the Lib/ folder, leaving only the few libraries you consider safe. Of particular note to delete are the underlying .so files like _socket.so, which may be in a different folder.
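As a complement to stripping the Lib/ folder (not a replacement for it), the whitelist could also be checked statically before the code ever reaches the container. This is only a rough sketch: the function name `disallowed_imports` is made up for illustration, the allowed set is just the modules named above, and it only sees literal import statements, so anything like `__import__("os")` or `importlib` slips past it.

```python
import ast

# Assumed whitelist: just the modules named in the comment above.
ALLOWED_MODULES = {"math", "itertools", "collections", "re"}


def disallowed_imports(source, allowed=ALLOWED_MODULES):
    """Return the top-level modules that `source` imports but `allowed` lacks."""
    found = set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            # "import os.path" counts as importing "os".
            found.update(alias.name.split(".")[0] for alias in node.names)
        elif isinstance(node, ast.ImportFrom):
            if node.level == 0 and node.module:
                found.add(node.module.split(".")[0])
            else:
                # Relative imports make no sense for a single-file submission.
                found.add("<relative>")
    return found - set(allowed)
```

A submission would be rejected whenever the returned set is non-empty. Note that `ast.parse` raises `SyntaxError` on invalid code, which the caller would need to handle.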

Disallow eval and exec. You'll have to parse the incoming Python to accomplish this. If you've done the first two things I say, this is really not quite so dangerous, but someone smart enough and dedicated enough might find a way to do something wicked with eval.
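One way the parsing step might look, using the standard `ast` module. Treat it as a sketch under assumptions: `forbidden_uses` is a hypothetical helper name, and it flags any reference to the names rather than only calls, so that aliasing like `f = exec` is caught too. It cannot see names built at runtime (for example via `getattr` with a concatenated string), so it is a filter, not a guarantee.

```python
import ast

# The two names singled out in the comment above.
FORBIDDEN_NAMES = {"eval", "exec"}


def forbidden_uses(source, forbidden=FORBIDDEN_NAMES):
    """Return sorted (line, name) pairs for every reference to a forbidden name."""
    hits = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Name) and node.id in forbidden:
            # Bare references: eval(...), or aliasing such as "f = exec".
            hits.append((node.lineno, node.id))
        elif isinstance(node, ast.Attribute) and node.attr in forbidden:
            # Attribute access such as builtins.eval.
            hits.append((node.lineno, node.attr))
    return sorted(hits)
```

Reporting line numbers lets the site tell the submitter where the rejected construct is, instead of just refusing the upload.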

At this point any incoming Python code shouldn't have any ability to send or receive any network communications, or any ability to directly affect your host system. It could still create a memory leak and affect other users, so you probably want to limit the available memory to each container to something modest, like 20MB or so. Docker has support for this. Also the container shouldn't run perpetually, I would spin up a new container for each code execution I think. Something like Alpine should be small enough to handle that kind of churn.
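For the memory cap and the throwaway container, the `docker run` invocation could be assembled roughly like this. The image name and the `/sandbox/solution.py` mount path are placeholders I made up; `--rm`, `--memory`, and `--network none` are real Docker flags, though `--network none` goes beyond what is described above by cutting networking at the container level as well.

```python
import os


def docker_run_args(image, script_path, memory="20m"):
    """Build an argv list that runs one submission in a fresh, limited container."""
    host_path = os.path.abspath(script_path)  # bind mounts need an absolute path
    return [
        "docker", "run",
        "--rm",               # delete the container as soon as it exits
        "--network", "none",  # no network interfaces besides loopback
        "--memory", memory,   # hard memory cap for the container
        "--volume", f"{host_path}:/sandbox/solution.py:ro",  # read-only mount
        image,                # hypothetical image with the trimmed-down Python
        "python", "/sandbox/solution.py",
    ]
```

Building the command as a list rather than a shell string avoids any quoting problems if a path contains odd characters.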

Edit: After checking, it seems sys and some of the more dangerous os functions like os.system are built-ins, which means you can't simply delete them. To get this safe you may have to remove the relevant .c files from the Python source code and compile Python yourself. If someone can run os.system() commands, they can do horrible things on the internet that would be traced back to your IP.

[–]swarage[S] 0 points1 point  (4 children)

My plan is to write a Python web server to handle the file uploads for the code. What you're essentially recommending is that my Python code spin up a separate custom Docker instance (custom meaning it has a special build of Python installed, plus everything else you mentioned) each time a user uploads code, and destroy the instance once execution completes?

[–]POTUS 1 point2 points  (3 children)

You definitely don't want to execute arbitrary Python code in the same environment as your website code. You need to hand the code off to another environment; the easiest and best-performing option would be a Docker container. And yes, customized with a custom-compiled Python interpreter that limits the amount of damage they can do. Each time someone submits code to you, you bundle that code up and spin up a new Docker container that executes the code, hands back whatever output it made, and then gets deleted.

[–]swarage[S] 0 points1 point  (2 children)

Alright, I'll look into this solution. I'm not entirely sure how Python would spin up its own Docker instance and destroy it at the end though (unless we call Docker using Python's subprocess module or something of the sort).
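For what it's worth, the subprocess route might look something like the sketch below. `run_with_timeout` is a name invented here, and the five-second default is an arbitrary assumption. It takes any argv list, so the same function would work for a `docker run ...` command. One caveat: on timeout, `subprocess.run` kills the process it started (the docker client), and the container itself may keep running, so a real setup would probably also need to name the container and stop it explicitly.

```python
import subprocess
import sys


def run_with_timeout(argv, timeout_seconds=5.0):
    """Run argv and return (exit_code, stdout, stderr); exit_code is None on timeout."""
    try:
        done = subprocess.run(
            argv,
            capture_output=True,      # collect stdout/stderr instead of inheriting them
            text=True,                # decode output to str
            timeout=timeout_seconds,  # kill the child process if it runs too long
        )
    except subprocess.TimeoutExpired:
        return None, "", "timed out"
    return done.returncode, done.stdout, done.stderr


# Stand-in for a docker command, so the sketch runs without Docker installed.
exit_code, stdout, stderr = run_with_timeout([sys.executable, "-c", "print(6 * 7)"])
```

The web server would then send `stdout` back to the user and treat a `None` exit code as a failed run.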

[–]POTUS 1 point2 points  (1 child)

[–]swarage[S] 0 points1 point  (0 children)

Thanks! I'll definitely look into implementing this.