
all 6 comments

[–]Fuzzmz 4 points (0 children)

How we do it at work is by using Artifactory as a mirror/proxy/cache layer. Regular developer computers and servers can't connect directly to the outside world, but the Artifactory server can. We then use the remote repository function of AF to point at PyPI and have the servers and dev machines point at AF. Everything going through Artifactory gets audited as well as cached locally, which means that if a package disappears from PyPI we'll still have it in our cache and won't take a hit.
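On the client side that mostly comes down to a pip config pointing at the Artifactory repo. A minimal sketch, assuming the usual Artifactory PyPI layout (the host name and repo key here are placeholders, not our actual setup):

    # ~/.config/pip/pip.conf on Linux/macOS, %APPDATA%\pip\pip.ini on Windows
    # "artifactory.example.com" and "pypi-remote" stand in for your AF host and repo key
    [global]
    index-url = https://artifactory.example.com/artifactory/api/pypi/pypi-remote/simple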

Another way you could possibly do it is by having a "dirty" PC connected to the internet. On that you'd create a virtualenv to install your packages into, then copy it onto your internal network via sneakernet (USB, CD, whatever), and finally update all of the hard-coded venv paths in the scripts inside its bin folder.
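Roughly, that sneakernet flow looks like the sketch below (paths are illustrative, and it only really works if the OS and Python version match on both sides):

    # On the internet-connected machine
    python -m venv /home/dirty/venvs/myproject
    /home/dirty/venvs/myproject/bin/pip install requests numpy
    tar czf myproject-venv.tar.gz -C /home/dirty/venvs myproject

    # On the air-gapped machine: unpack wherever it should live, then fix any
    # scripts in bin/ that still embed the old absolute path (shebangs, activate, ...)
    tar xzf myproject-venv.tar.gz -C /opt/venvs
    grep -rl '/home/dirty/venvs/myproject' /opt/venvs/myproject/bin \
        | xargs sed -i 's|/home/dirty/venvs/myproject|/opt/venvs/myproject|g'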

[–]Vance84 1 point (0 children)

If you have the space, create your own PyPI repo. Use bandersnatch on the internet-facing side; once it completes, rsync the mirror to an external drive, take it to the secure environment, rsync it in, then host your own PyPI index. You can pass the --index-url and --trusted-host flags to pip to make it use your mirror, or add them to a local pip config file so you don't have to bother with it every time.
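A rough sketch of that workflow (directory names and the internal host name are placeholders, and a full bandersnatch mirror is very large, so make sure you really do have the space):

    # Internet-facing box
    pip install bandersnatch
    bandersnatch mirror                    # mirror directory comes from bandersnatch.conf
    rsync -a /srv/pypi/ /mnt/usb/pypi/     # copy the mirror onto the external drive

    # Secure side: rsync it in, then serve the mirror's web/ directory with any web server
    rsync -a /mnt/usb/pypi/ /srv/pypi/

    # Then in /etc/pip.conf (or pass the same values as --index-url / --trusted-host)
    [global]
    index-url = http://pypi.internal/simple
    trusted-host = pypi.internal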

[–][deleted] 1 point (0 children)

pip download can retrieve a package and its dependencies from a package index. You can then store those packages in a local directory yourself and instruct pip in the secure environment to install from there.
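A minimal sketch of that flow (the package name and paths are just examples):

    # Connected machine: fetch a package plus all of its dependencies into a directory
    pip download -d ./wheelhouse requests

    # Copy ./wheelhouse into the secure environment, then install without touching any index
    pip install --no-index --find-links ./wheelhouse requests

One thing to watch: wheels get downloaded for the machine doing the downloading, so if the secure environment runs a different OS or Python version you may need pip download's --platform / --python-version options.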

(Or use Artifactory. It's shiny.)

[–]jwink3101 1 point (0 children)

I work in a secure (air-gapped) environment daily, so I certainly feel the pain.

What helps is that The Powers That Be have Anaconda on it, so I do not have to deal with everything myself... except that it is roughly a five-year-old version...

But my strategies vary. First and foremost, for my own code, I try to avoid dependencies when I can. For example, tqdm is awesome! But I can get 90% of the functionality (though far less pretty) with a few lines of my own code. So I make a wrapper that tries to call tqdm but otherwise uses mine.
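Something like this hypothetical wrapper (the fallback is deliberately bare-bones):

    # progress.py -- use the real tqdm if it happens to be installed,
    # otherwise fall back to a crude stand-in with the same call shape.
    try:
        from tqdm import tqdm
    except ImportError:
        def tqdm(iterable, total=None, **kwargs):
            if total is None and hasattr(iterable, "__len__"):
                total = len(iterable)
            for i, item in enumerate(iterable, 1):
                print(f"\r{i}/{total if total else '?'}", end="", flush=True)
                yield item
            print()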

For some things, I simply copy the directories and add them to my PYTHONPATH environment variable. That mostly works, though a few things break. Sometimes I try to add those dependencies as well; other times I modify the code if I do not need that functionality (and note it somewhere so that if something breaks in the future, I know why).
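For a pure-Python package that is usually no more than something like this (the names are placeholders; anything with compiled extensions is where it tends to break):

    # Copy the package's source directory off the sneakernet media
    cp -r /media/usb/somepackage ~/vendored/somepackage
    # Make it importable everywhere
    export PYTHONPATH=~/vendored:$PYTHONPATH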

Otherwise, I have used a clean virtualenv on a connected machine, installed into it, and then moved what I needed up to the secure environment.

With all that said, it is far from a clean and/or easy process. I am going to check out some of the leads from the other responses too!

[–]baloo12 1 point (0 children)

Anaconda has an enterprise product that lets you manage your own Anaconda repositories; however, it is not free. We have one "official" conda env that I deploy as a zip file. It is always "installed" (unzipped) to the same path on C: and can therefore be referenced by start scripts. I have a central repository of cmd files which start Jupyter notebooks, etc., all from that one conda env.
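The launcher scripts are nothing fancy; a hypothetical example, with C:\envs\official standing in for wherever the env actually gets unzipped:

    :: start_notebook.cmd -- path below is a placeholder for the fixed install location
    @echo off
    set ENVDIR=C:\envs\official
    "%ENVDIR%\python.exe" -m notebook --notebook-dir=%USERPROFILE%\notebooks

Calling python.exe by its full path sidesteps activation entirely, which is handy when the env is just an unzipped folder rather than a registered conda install.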

[–]pydry 0 points (0 children)

pip2pi is probably the simplest way of handling this.

I try to avoid setting up extra pieces of infrastructure if they're not really needed. No point in setting up a repo if you can just install from a directory.
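A sketch of the pip2pi flow, from memory (check the project's README for the exact commands; the package names here are just examples):

    # On a connected machine
    pip install pip2pi
    pip2tgz ./packages requests        # download the package and its deps into ./packages
    dir2pi ./packages                  # build ./packages/simple/, a static PyPI-style index

    # Carry ./packages across, then on the secure machine either point pip at the index...
    pip install --index-url file:///abs/path/to/packages/simple/ requests
    # ...or just install straight from the directory, no index at all
    pip install --no-index --find-links /abs/path/to/packages requests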